Abstract

When beneficial mutations are rare, they accumulate by a series of selective sweeps. 
But when they are common, many beneficial mutations will occur before any can 
fix, so there will be many different mutant lineages in the population concurrently. 
In an asexual population, these different mutant lineages interfere and not all can 
fix simultaneously. In addition, further beneficial mutations can accumulate in 
mutant lineages while these are still a minority of the population. In this paper, 
we analyze the dynamics of such multiple mutations and the interplay between 
multiple mutations and interference between clones. These result in substantial 
variation in fitness accumulating within a single asexual population. The amount 
of variation is determined by a balance between selection, which destroys variation, 
and beneficial mutations, which create more. The behavior depends in a subtle 
way on the population parameters: the population size, the beneficial mutation 
rate, and the distribution of the fitness increments of the potential beneficial mu- 
tations. The mutation-selection balance leads to a continually evolving population 
with a steady-state fitness variation. This variation increases logarithmically with 
both population size and mutation rate and sets the rate at which the population 
accumulates beneficial mutations, which thus also grows only logarithmically with 
population size and mutation rate. These results imply that mutator phenotypes 
are less effective in larger asexual populations. They also have consequences for 
the advantages (or disadvantages) of sex via the Fisher-Muller effect; these are 
discussed briefly. 

INTRODUCTION



The vast majority of mutations are neutral or deleterious. Extensive study of such muta- 
tions has explained the genetic diversity in many populations, and been useful for inferring 
population parameters and histories from data. Yet beneficial mutations, despite their rarity, 
are what cause long-term adaptation. Unfortunately, our understanding of their dynamics
remains poor by comparison. Beneficial mutations also alter the distributions of neutral 
and deleterious mutations: in asexual populations, as well as in regions of a sexual genome 
that remain linked on the relevant timescales, positive selection can dramatically affect the 
genetic diversity. 

When beneficial mutations are rare and selection is strong, positive selection results in a 
succession of selective sweeps. A mutation occurs, spreads through the population due to 
selection, and soon fixes. Some time later, another such event may occur. This situation is 
sometimes called the strong selection weak mutation regime — to make its character clear, 
we will refer to it as the successional-mutations regime: between sweeps, there is a single 
"ruling" population. In this regime, the effect of positive selection on patterns of genetic 
variation is reasonably well understood. A selective sweep reduces the genetic variation in 
regions of the genome linked, over the timescale of the sweep, to the site at which a beneficial 
mutation occurs: other mutations in these regions hitchhike to fixation. 

Successional-mutations behavior typically occurs in small to moderate sized populations 
in which beneficial mutations are sufficiently rare. However, a different regime occurs in 
larger populations, in which beneficial mutations occur frequently. When beneficial muta- 
tions are common enough that many mutant lineages can be simultaneously present in the 
population, selective sweeps will overlap and interfere with one another (i.e. different bene- 
ficial mutations will grow in the population concurrently). If, in addition, selection is strong 
enough that it is not dominated by random drift (except while mutants are very rare), we 
have a "strong selection strong mutation" regime. For clarity, we will refer to this as the 
concurrent-mutations regime. The effects of concurrent mutations in asexual populations are 
the focus of the present paper. As we will see, the concurrent beneficial mutations regime is 
not an unusual special case: many viral, bacterial and simple eukaryotic populations likely 
experience evolution via multiple concurrent mutations. 

In populations which contain many different beneficial mutants, there will be substantial
variation in the fitness within the population. This variation will be acted on by selection. 
But in the absence of new mutations, the variation will soon disappear. Thus the traditional 
approach to evolution of quantitative traits — to assume that there always exists genetic 
variation (as for traits not subject to selection) — fails badly. New mutations are crucially 
needed to maintain the variation on which further selection can act. Thus to understand 
adaptation when multiple mutations are involved, it is essential to analyze the interplay 
between selection and new beneficial mutations, especially how the latter maintains the 
former. Understanding this beneficial mutation-selection balance and the resulting dynamics 
is the primary goal of this paper. 

Both the successional- and concurrent-mutations regimes require that selection dominates 
drift except while mutants are very rare. A qualitatively different regime occurs with weakly 
beneficial mutations: these do not sweep in the traditional sense because drift dominates 
their dynamics. This weakly beneficial regime most readily occurs in small populations, 
where selective forces cannot overcome drift, or when considering mutations of very small
effect, such as those that affect synonymous codon usage (McVean and Charlesworth 2000;
Comeron et al.; Przeworski et al.; 1999; 1987). But when populations
are large, selection of beneficial mutations of fitness increment s will tend to dominate over
drift (unless the beneficial mutations are extremely rare). In this paper we are interested in 
beneficial mutations in large populations, thus we focus exclusively on the strong selection 
regimes for which drift is only important for beneficial mutant lineages while they are a tiny 
minority of the population. 

The essential difference between the successional-mutations and concurrent-mutations 
regimes is presented in Fig. 1, which depicts beneficial mutations in an asexual population. 
In a small enough population, or one whose beneficial mutation rate ($U_b$) is low, beneficial
mutations occur rarely enough that they are well separated in time and one can sweep before
another arises (Fig. 1a). This is the successional-mutations regime, in which the beneficial
mutations all behave independently. However, in a larger population or at higher $U_b$, multiple
mutant populations exist concurrently and they are no longer independent (Fig. 1b).
Mutations that occur in different lineages cannot both fix in the absence of recombination: 
at least one of them must be "wasted". In this paper, we focus on understanding how an 
asexual population in the concurrent-mutation regime accumulates beneficial mutations.

In the concurrent mutations regime, two important effects occur. The first is when a 
moderately beneficial mutation occurs and begins to sweep, only to be outcompeted by 
a later, more strongly beneficial mutation that occurs in a wild-type individual (i.e. an 
individual of the majority population). The first mutation is then wasted as it is eliminated 
along with the then-majority type by the sweep of the stronger mutation. This effect is 
referred to as clonal interference; it is illustrated in Fig. 1c. Note that (despite earlier
broader definitions) we will use the term "clonal interference" to refer to only this first effect, 
consistent with the focus of recent work on the subject. The second effect is when multiple 
mutations occur in the same lineage before the first beneficial mutation fixes. For example, 
a second (moderate-effect) beneficial mutation can occur in an individual that already has 
the one beneficial mutation. The double mutant can then benefit from the combined effect 
of the two mutations and outcompete the single mutant as well as some other stronger 
single-mutants that arise in the majority population. This process is illustrated in Fig. 1d.

The dynamics of evolution in the concurrent-mutations regime is important to understand.
At the very least, this is essential for forming sensible null expectations about experimental,
observational and genomic data from large populations, which, after all, are generally
evolving. Knowing how the effects of beneficial mutations depend on mutation
rate and population size is crucial for making meaningful comparisons between different 
populations. Most important, in our view, is developing an intuition for how large popula- 
tions evolve. The simple picture of successive selective sweeps in the successional-mutations 
regime is a valuable guide to thinking about positive selection. Yet we have little intuitive 
guidance when the successional-mutations approximation does not apply. This is a serious 
shortcoming in our understanding of the evolution of a wide array of populations, including 
viruses and most unicellular organisms. 

Although it is not as well understood as the successional-mutations regime, the
concurrent-mutations regime has been the subject of substantial interest since the 1930s.
Fisher (1930) and Muller (1932) first noted the potential importance of interference
between beneficial mutations (Muller drew diagrams very similar to our Fig. 1). They proposed
what has come to be known as the Fisher-Muller hypothesis for the advantage of sex:
sexual populations can recombine beneficial mutations in competing lineages into the same
individual. This prevents mutational events from being wasted, as they often are in asexual
populations. 

Much subsequent work on positive selection in the concurrent-mutations regime
has focused on the implications for the evolution of sex. Maynard Smith (1971),
Crow and Kimura (1965), and Bodmer (1970) attempted to quantify the Fisher-Muller
effect in the late 1960s and early 1970s. However, their analysis was incomplete: it did
not fully account for stochastic behavior, ignored triple and higher mutations, and did not
correctly account for the effects of sex. Contemporaneously, Hill and Robertson (1966)
looked at this problem from the perspective of the linkage disequilibrium generated by multiple
linked beneficial mutations segregating simultaneously. This has become known as the
Hill-Robertson effect. It is essentially equivalent to the Fisher-Muller effect and the analysis
of Maynard Smith (1971) and Crow and Kimura (1965) (see Felsenstein (1974)
for a detailed discussion). In recent years, Barton (1995), Barton and Otto (2005),
and Otto and Barton (1997, 2001) have analyzed the Fisher-Muller effect from the
Hill-Robertson perspective. Their work focuses on the buildup of linkage disequilibrium due to
mutations and selection, and the average effect of recombination on the variance in fitness 
and the destruction of disequilibrium. This provides useful insight into the effects of sex, 
but does not explain the full evolutionary dynamics or population genetic structure created 
by this type of positive selection. 

In this paper, we step back from the long tradition of studying the implications of con- 
current mutations for the evolution of sex and focus instead on the basic dynamics shown 
schematically in Fig. 1b. We show, both heuristically and quantitatively, how an asexual
population in the concurrent-mutations regime accumulates many beneficial mutations,
what the fitness distribution looks like, how it develops, and how quickly selected substi- 
tutions occur via collective sweeps. We develop a framework for thinking more generally 
about positive selection and its effects that is applicable to large populations of asexuals or 
any other case where linkage between mutations is important. 

We do not analyze the questions about sex or patterns of diversity in this paper. However, 
these questions should be informed by our results; some can be studied within the framework 
we present in this paper. For example, when recombination is uncommon, the average 
effects of sex may be irrelevant — instead all that matters is whether or not it creates rare 
individuals that are much more fit than the majority of the population. To study this,
we must first understand the full distribution of genetic diversity within the population. 
Similarly, before analyzing the patterns of genetic variation exhibited by populations in 
which multiple linked beneficial mutations have occurred — or are occurring — one must 
understand the rate of beneficial substitutions and typical interference patterns between 
these within the linked regions. 

To understand the concurrent-mutations dynamics in detail, it is essential to start with 
a specific model that focuses on some subset of the important effects. Features can then be 
added after enough understanding has been gleaned to enable predictions of which effects are 
model-specific and which are more general. Positive selection can involve various complica- 
tions, including epistasis (interactions between effects of mutations), conditionally beneficial 
mutations, frequency-dependent benefits, and changing environments, among others. Many
different scenarios are possible. At present we have little understanding of which, if any, 
of these situations are biologically "typical," and which are unusual. In this paper, we do 
not attempt to catalogue all possible complications; this is an impossibly broad subject. 
Instead we look at the simplest possible situation involving positive selection of concurrent 
mutations. We suppose that a variety of beneficial mutations are available to a population, 
and ask how the population acquires them. We assume these mutations interact in a simple 
multiplicative way (additive for the growth rates) with no epistasis, frequency dependence, 
or changing environment of any kind. In short, we ask how the population climbs a single 
smoothly sloped "hill" in fitness space (Fig. 2). 

This simple scenario is probably common. Populations often find themselves in an envi- 
ronment where they can accumulate quite a few different beneficial mutations which each 
independently (or at least roughly so) help them adapt. Even when this simple hill-climbing 
scenario does not apply, it is an important null model. Some more complex forms of positive 
selection may also prove tractable within the framework we describe, while others will not; 
these leave open many avenues for future work. 

Various other authors have studied the dynamics of multiple concurrent beneficial
mutations under the simple assumptions outlined above. Gerrish and Lenski (1998)
analyzed "clonal interference" between mutations of different strengths; this has since
been extended by various authors (Campos and De Oliveira 2004; Gerrish 2001;
Johnson and Barton 2002; Kim and Stephan 2003; Orr 2000; Wilke 2004). This
work focuses on the interference between mutations of different strengths that occur in the
same lineage, while neglecting the competition between mutations that arise in different
lineages, in particular multiple mutants. Our analysis in the present paper starts instead
with the other concurrent-mutation effect, multiple mutants, initially in a model in which
clonal interference is absent. In any real situation, the two effects will both occur. We thus
discuss the interplay between clonal interference and multiple mutations in a later section.
The detailed analysis will be presented elsewhere; in the present paper we summarize a
few of the salient results.

Kim and Orr (2005) have also recently analyzed a model which
combines some aspects of clonal interference and multiple mutations. At this point, it is
important to note that if population parameters are such that clonal interference is important,
the effects of multiple mutants are usually at least of comparable importance. Thus there 
is some inconsistency in focusing on clonal interference alone. 

To focus on the effects of multiple mutants without clonal interference, two additional 
simplifying approximations are useful. For most of this paper, we study a model in which 
each beneficial mutation has the same effect s on fitness (i.e. each step uphill is of the 
same size). Furthermore, to focus on the effects of positive selection, we neglect deleterious 
mutations in the primary analysis. Even though neither assumption will typically be true, 
these turn out to be reasonable approximations in many circumstances. Situations in which 
they are not appropriate are more complicated scenarios for positive selection, some of which, 
especially the effects of a distribution of fitness increments, we discuss briefly. 



Remarkably, even the simplest possible model with many equal-strength beneficial
mutations available is only partially understood. Kessler et al. (1997) and Ridgway et al.
(1998) analyzed a similar simple model, but their initial work did not handle random drift
correctly. More recently, they have developed a sophisticated although somewhat unwieldy
moment-based approach (Kessler and Levine 2003) from which it is unfortunately hard
to understand the essential aspects of the dynamics. Rouzine et al. (2003) also studied a
model similar in its essential aspects to our simplest model (although also including deleterious
mutations of the same magnitude). They were concerned with viral evolution, and
their results are primarily valid for very large mutation rates appropriate for many viruses.
Nevertheless, if worked out more fully from Rouzine et al.'s analysis, several results can be
obtained that are closely related to ours. But our analysis involves a less mathematically
formal approach; we believe it is both clearer and a better basis for further development
(some of which is included herein). We focus in this paper on a different regime from
Rouzine et al. (2003): it should obtain in single-celled organisms (and some viruses). We
discuss in detail, below, the relationship between our analysis and that of Rouzine et al.
(2003).

The outline of this paper is as follows. We begin by describing in the next section a 
heuristic approach to the dynamics. This analysis gets the behavior roughly correct, and 
illustrates the ideas underlying our approach. We then describe the simplest model more 
precisely, and analyze it in the following section. We next discuss transient behavior before
the population has reached its steady state fitness distribution, and address the effects of 
deleterious mutations. In the next section, we make comparisons between our analytic results 
and simulations. We then relax our assumption that all mutations have the same effect, and 
discuss the relationship between our theory and clonal interference analysis. Finally, we 
summarize our results and discuss future directions. 



HEURISTIC ANALYSIS AND INTUITION 

In the simplest situation with multiple concurrent beneficial mutations available, there 
are three important parameters: the population size, N, the beneficial mutation rate per 
individual per generation, $U_b$, and the fitness increase $s$ provided by each mutation. We will
refer to the basic exponential growth rate, $r$, of a population as its fitness (rather than its
growth factor per generation $R = e^r \approx 1 + r$). Thus two mutations of magnitude $s_1$ and $s_2$
increase fitness by $s_1 + s_2$ (in the absence of epistasis, which we will generally assume). We
call the rate of increase, $d\langle r\rangle/dt$, of the average fitness of a population the speed of evolution
and denote it v. 
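
The fitness convention above can be checked numerically with a minimal sketch. The function names and the example increments are illustrative assumptions, not part of the original analysis; the sketch only shows that multiplying per-generation growth factors is the same as adding growth rates, and that $e^r$ is close to $1 + r$ for small $r$.

```python
import math

def growth_factor(r):
    """Per-generation growth factor R = e^r for an exponential growth rate r."""
    return math.exp(r)

def combined_fitness(increments):
    """Total fitness gain when increments add (no epistasis assumed)."""
    return sum(increments)

# Hypothetical example increments, chosen only for illustration.
s1, s2 = 0.01, 0.02
r = combined_fitness([s1, s2])

# Multiplicative growth factors correspond to additive growth rates.
product_of_factors = growth_factor(s1) * growth_factor(s2)
factor_of_sum = growth_factor(r)

# For small r the growth factor is close to 1 + r.
linear_error = factor_of_sum - (1.0 + r)
print(product_of_factors, factor_of_sum, linear_error)
```

The linear approximation error is of order $r^2/2$, which is why the distinction between $R$ and $1 + r$ is negligible when $s \ll 1$.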

To focus on the effects of multiple mutants in a situation in which clonal interference 
does not occur, we initially restrict consideration to the approximation that all beneficial 
mutations have the same effect. A $k$-tuple mutant thus has fitness $ks$ greater than the
original wild-type. The speed of evolution is then simply $v = s\,d\langle k\rangle/dt$.

We begin by reviewing the successional-mutations regime where beneficial mutations are 
sufficiently separated in time for them to sweep independently, as in Fig. 1a. Although
this is exactly analyzable and well known, it is instructive to consider it from a heuristic 
perspective. We then turn to a heuristic analysis of the more complex concurrent-mutations 
dynamics illustrated in Fig. 1d.

Successional-Mutations Regime and the Establishment of Mutants 

Small asexual populations evolve by accumulating beneficial mutations sequentially. Beneficial
mutations occur in the population at a total rate $NU_b$. If the selective advantage $s$ of a
mutation is in the range $1/N \ll s \ll 1$, the probability that a particular mutant will survive
random drift is proportional to $s$ (the constant of proportionality depends on the specific model for the
stochastic dynamics; for our model it is 1 and we discuss in the Model section below the
minor modifications of our results that are needed for other stochastic dynamics). We call
the process by which the lineage of a beneficial mutant that survives drift becomes large
enough for the population of its descendants to grow deterministically the establishment of
the mutant clone. Thus new beneficial mutations are established at a rate $NU_b s$ per generation
(other mutant populations die out due to random drift) so that a new mutation will
become established about once every $\tau_{\mathrm{establish}} = \frac{1}{NU_b s}$ generations. As we will show later,
a mutant population typically becomes established when its size is of order $1/s$. Thus once
it has become established, the mutant takes of order $\tau_{\mathrm{fix}} = \frac{1}{s}\ln[Ns]$ generations to fix (we
will loosely call "fixed" a mutant lineage that has grown to represent a large fraction of the 
population; the conventional definition corresponds to fully fixed, which takes about twice 
as long). 

When the population size or mutation rate is small enough, fixation will happen more 
quickly than establishment. This occurs when 

$$\tau_{\mathrm{fix}} \sim \frac{\ln[Ns]}{s} \ll \tau_{\mathrm{establish}} \sim \frac{1}{NU_b s}, \tag{1}$$

which corresponds to $NU_b \ll \frac{1}{\ln[Ns]}$. When this condition holds, we are in the successional
mutations regime, in which the establishment rate is limiting: a mutation A that arises and 
fixes will do so long before the next mutation destined to survive drift, B, is established. 
Thus a relevant mutation B occurs in a population that has already fixed A, yielding AB, 
and B fixes well before mutation C is established as mutant ABC. Beneficial mutations 
continue to accumulate in this simple way. New mutations arise and fix at average rate
$NU_b s$, each one increasing the fitness by $s$. Thus fitness increases at a speed

$$v = NU_b s^2, \tag{2}$$

linear in the product $NU_b$. This linear mutation-limited behavior characterizes the
successional-mutations regime of successional selective sweeps. 
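
As a rough numerical illustration of the successional-mutations estimates, the sketch below evaluates the establishment interval, the fixation time, and the speed $v = NU_b s^2$. The function names and parameter values are hypothetical choices made here for illustration; the formulas are the heuristic, order-of-magnitude expressions given above, not exact results.

```python
import math

def tau_establish(N, U_b, s):
    """Typical waiting time between establishments, 1 / (N * U_b * s) generations."""
    return 1.0 / (N * U_b * s)

def tau_fix(N, s):
    """Heuristic time for an established mutant to sweep, ln(N * s) / s generations."""
    return math.log(N * s) / s

def successional_speed(N, U_b, s):
    """Rate of fitness increase when sweeps do not overlap: v = N * U_b * s**2."""
    return N * U_b * s ** 2

# Hypothetical parameters: a small population with rare beneficial mutations.
N, U_b, s = 1e4, 1e-8, 0.01

t_est = tau_establish(N, U_b, s)
t_fix = tau_fix(N, s)
v = successional_speed(N, U_b, s)
print(t_est, t_fix, v)
```

With these illustrative values the fixation time is several hundred generations while establishments are about a million generations apart, so sweeps are well separated.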



Concurrent-Mutations Regime 



In larger populations, the behavior is more complex, as illustrated by Fig. 1d. In this case,
the establishment times of new mutants are shorter than their fixation times, corresponding
to

$$NU_b \gg \frac{1}{\ln[Ns]}. \tag{3}$$

Thus new beneficial mutations arise and become established before earlier ones can sweep, 
causing them to interfere with one another. 
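
The two regime conditions can be turned into a simple check. The sketch below is only a crude classifier: it treats the conditions, which are stated as strong inequalities, as a sharp threshold at $NU_b = 1/\ln[Ns]$, and it uses the range $1/N < s < 1$ as a stand-in for the strong-selection requirement. Both simplifications, the function name, and the example parameters are assumptions made for illustration.

```python
import math

def regime(N, U_b, s):
    """Crude regime label obtained by comparing N*U_b with 1/ln(N*s).

    The text states the conditions as strong inequalities, so parameter
    values near the threshold fall in a crossover region that this sharp
    cutoff does not describe.
    """
    if not (1.0 / N < s < 1.0):
        return "outside strong-selection range"
    threshold = 1.0 / math.log(N * s)
    if N * U_b > threshold:
        return "concurrent"
    return "successional"

# Hypothetical parameter sets.
small_population = regime(1e4, 1e-8, 0.01)
large_population = regime(1e9, 1e-5, 0.01)
weak_selection = regime(100, 1e-5, 0.001)
print(small_population, large_population, weak_selection)
```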

Two types of interference are important. Recent clonal interference analyses
have focused on one: the competition that occurs when two mutations which
have different strengths occur independently in individuals with similar initial fitness
(Campos and De Oliveira 2004; Gerrish 2001; Gerrish and Lenski 1998;
Johnson and Barton 2002; Kim and Orr 2005; Kim and Stephan 2003; Orr 2000;
Wilke 2004). We focus in the bulk of this paper on the other type of interference: a
mutation that arises in a fitter background (e.g. one with an earlier beneficial mutation)
will outcompete another mutation of similar effect that occurs in a less fit background.
In the constant-s model clonal interference is explicitly absent, and we thus initially focus 
exclusively on this latter effect. In this constant s approximation, two different mutants 
that occur among those with the same fitness (in particular members of the same clone) 
will compete equally and sweep together, each becoming only partially fixed. Unless we 
are interested in the neutral genetic variability of the population, all subpopulations with 
the same fitness can be considered as a single subpopulation: we will do this except in the 
discussion at the end of this paper. Also, we postpone discussion of the interplay between 
clonal interference and multiple mutants (i.e. going beyond the constant s model) to a later 
section below. 






First consider starting from a monoclonal population. Mutations initially give rise to 
a subpopulation with fitness increased by $s$ (Fig. 3a). The size of this mutant subpopulation
drifts stochastically, but eventually becomes large enough, roughly $1/s$ individuals, to
become deterministic. This takes a (stochastic) establishment time, $\tau_1$. After its establishment
but before its fixation, mutations can occur in the still-small mutant subpopulation
to create double mutants with fitness $2s$ (Fig. 3b). This typically happens well before the
single mutants have fixed (else we are by Eq. (1) in the successional-mutations regime). A
double-mutant population thereby becomes established a time $\tau_2$ after the establishment of
the single-mutant population. Triple mutants then begin to arise, and become established
after an additional time $\tau_3$. This interval is typically shorter than $\tau_2$ primarily because
double-mutants grow faster than single-mutants and hence generate more mutations, and, 
in addition, because the triple-mutants are more fit than double-mutants and hence survive 
drift more easily (with probability 3s rather than 2s). 

This process continues, accelerating at each step. Eventually, however, enough time 
passes that the single-mutant subpopulation (or one of the multiple-mutant subpopulations) 
becomes larger than the original wild-type. This near-fixation of the single-mutants increases 
the mean fitness by s, which balances the accelerating front and creates a moving fitness 
distribution which will attain a (roughly) steady state width with the mean fitness increasing 
with a steady state average speed, v. This is a form of mutation-selection balance: as each 
new beneficial mutation becomes established, the mean fitness increases by s and the fitness 
distribution moves to higher fitness while maintaining the same shape. 

It is useful to consider this process in more general terms. The key to the behavior is the 
balance between mutation, which increases the variation in fitness within the population, 
and selection, which decreases the variation by eliminating all but the fittest individuals. If 
we were discussing deleterious mutations, mutation would also oppose the tendency of selec- 
tion to increase the mean fitness, leading to a steady-state distribution of fitness (ignoring 
Muller's Ratchet, which for large populations only matters on extremely long timescales). 
This deleterious mutation-selection balance, which is independent of population size for
large N, has long been understood (Gillespie 1998). In our case, the dynamics are more
subtle because the important mutations are beneficial. The basic idea of mutation-selection
balance, however, is unchanged. Mutations broaden the fitness distribution while selection
narrows it, creating a steady state variance around an increasing mean fitness. But unlike
the deleterious case, the dynamics of the rare individuals near the most-fit tail of the fitness
distribution (the "nose") control the behavior. Selection moves the distribution towards
higher fitness at a rate very close to the steady state variance in fitness, the classic result
in the absence of mutations (the "fundamental theorem of natural selection") (Fisher 1930).
But new beneficial mutations at the nose are essential to maintain this variance:
in their absence the fitness distribution would collapse to a narrow peak near the most-fit
individual and evolution would grind to a halt.

The crucial dependence on new mutations in the nose makes the analysis of the beneficial 
mutation-selection balance more complex than in the deleterious case. It is now essential 
to account properly for random drift in the small populations near the nose. In the case 
of deleterious mutation-selection balance, rare new mutants are less fit than the rest of 
the population. They will die out soon anyway, so failing to account properly for the 
stochastic dynamics by which they do so has no serious consequences. Random drift is 
only important with solely deleterious mutations if Muller's ratchet is operating, i.e. if 
the most-fit individuals are rare enough that they can die out due to random drift. The 
beneficial mutation-selection balance is quite analogous to this Muller's ratchet case. Here 
too the subpopulations that are more fit than average control the long-term behavior of 
the population, and these are small enough that correct stochastic treatment is essential. 
As is the case with Muller's ratchet, infinite-N deterministic approximations are not even
qualitatively correct. Indeed, with a large supply of beneficial mutations, deterministic 
analysis incorrectly predicts a rapid acceleration of the nose towards an infinite speed of 
evolution. This nonsense result is because of the creation in the deterministic approximation 
of (what are effectively) fractional numbers of new much fitter mutants which then grow 
exponentially, unhampered by drift, and dominate the behavior soon after (we describe this 
in more detail in Appendix A). 

There are two factors that determine the dependence of the speed of evolution on the 
population size. The first is the dynamics of already established populations, which is domi- 
nated by selection. The second is the new mutations that occur in the fittest subpopulation. 
We define the lead of the distribution, $Q$, as the difference between the fitness of the most-fit
individual and the mean fitness of the population (more precisely, $Q - s$ is the difference
between the mean fitness and that of the most-fit established mutant class). We define $q$ by
$Q = qs$, so that if the lead is $Q$, the most-fit individuals have $q$ more beneficial mutations
than the average individual: they have a "lead" $Q$ in the race to higher fitness. Once it is
established, the fittest population grows exponentially, first at rate $qs$ but more slowly as
the mean advances. Growing from its establishment upon reaching size $1/(qs)$ until it reaches
a large fraction of $N$ will thus take time $\ln(Nqs)/(qs/2)$, since $qs/2$ is the average growth rate.
In this time the mean fitness will increase by $qs$. Therefore $v \approx (qs)^2/[2\ln(Nqs)]$. One
can show that this v is equal to the variance in fitness, as expected if mutation is indeed 
negligible compared to selection in the bulk (i.e. away from the nose) of the distribution, so 
that the fundamental theorem of natural selection applies. 

The other factor is the dynamics of the nose, where mutations are essential. A more-fit 
mutant that moves the nose forward by s will be established some time τ_q after the 
previous most-fit mutant. Thus the nose advances at a speed v = s/⟨τ_q⟩, where ⟨τ_q⟩ is 
the average τ_q. After it is established, the fittest population n_q will grow exponentially at 
rate qs and produce mutants at a rate U_b n_q ≈ U_b e^{qst}/(qs). Many new mutants will occur 
soon after the time at which U_b ∫ n_q(t) dt becomes equal to one, so the time it takes a new 
mutant to establish is τ_q ~ ln(s/U_b)/(qs). This means the nose advances at rate v ~ s/⟨τ_q⟩ ≈ 
q s^2/ln(s/U_b). Significantly, the behavior of the nose depends only on mutations from the 
most-fit subpopulation; it is almost independent of the less fit populations and thus can 
depend on N only via the lead, qs. As far as the nose is concerned, the majority of the 
population (destined to die out shortly) is important only to ease the competition for 
the fittest few. Yet we argued above that the bulk of the population fixes the speed of the 
mean via the selection pressure: v ≈ (qs)^2/[2 ln(Nqs)]. In steady state, the speed of the 
mean must equal the speed of the nose: this is the mutation-selection balance. This implies that 

$$ q \approx \frac{2\ln[Ns]}{\ln[s/U_b]} \tag{4} $$

and 

$$ v \approx \frac{2 s^2 \ln[Ns]}{\ln^2[s/U_b]}. \tag{5} $$

These results are very close to the more careful calculations below. All the basic qualitative 
behavior follows from this intuitive reasoning. 
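
The heuristic balance is easy to evaluate numerically. The following Python sketch is illustrative only and is not part of the original analysis: the function name and the parameter values are assumptions chosen for the example. It evaluates Eqs. (4) and (5) and, as an optional refinement, iterates the balance q = 2 ln(Nqs)/ln(s/U_b) that results from equating the two speeds before the factor q inside the logarithm is dropped.

```python
import math

def heuristic_lead_and_speed(N, s, U_b, iterations=50):
    """Return (q, v, q_refined) from the heuristic mutation-selection balance.

    q and v follow Eqs. (4) and (5). q_refined keeps the factor q inside the
    logarithm, i.e. it solves q = 2 ln(N q s) / ln(s / U_b) by fixed-point
    iteration (a hypothetical refinement added for illustration).
    """
    if N * s <= 1 or s <= U_b:
        raise ValueError("heuristic assumes N*s >> 1 and s/U_b >> 1")
    ell = math.log(s / U_b)            # mutation logarithm, ln(s/U_b)
    L = math.log(N * s)                # selection logarithm, ln(Ns)
    q = 2.0 * L / ell                  # Eq. (4)
    v = 2.0 * s**2 * L / ell**2        # Eq. (5)
    q_refined = q
    for _ in range(iterations):
        arg = N * q_refined * s
        if arg <= 1.0:                 # outside the regime where the heuristic applies
            break
        q_refined = 2.0 * math.log(arg) / ell
    return q, v, q_refined

# Assumed example values: N = 1e9, s = 0.01, U_b = 1e-5.
q, v, q_refined = heuristic_lead_and_speed(1e9, 0.01, 1e-5)
print(f"q = {q:.3f}, v = {v:.3e} per generation, refined q = {q_refined:.3f}")
```

For these assumed values the lead is only a handful of mutations even though NU_b is 10^4, which reflects the logarithmic dependence on both N and U_b.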

For large NU_b, we have found that v depends logarithmically on N and U_b, much slower 
than the linear dependence on NU_b which holds for smaller populations. This reduction 
occurs because at large NU_b, almost all beneficial mutations occur in individuals far from 
the nose of the fitness distribution (i.e. in a bad genetic background) and are therefore 
"wasted," since these subpopulations are doomed to extinction. Thus increasing N does 
not directly increase the supply of important mutations, as these occur in the relatively few 
individuals at the nose. Rather, the effect of increasing N is to increase the time required 
for selection to move the mean fitness, which increases the lead, which makes individuals 
at the nose more fit relative to the mean fitness, which speeds the establishments at the 
nose. Similarly, increasing U_b does not directly affect the dynamics of most of the fitness 
distribution. Rather, it decreases the time for new mutations to occur at the nose, which 
means that more mutations can occur before the mean moves, which increases the lead and 
speeds the evolution. 

This also explains why v is not a function of NU_b: N directly affects only selection 
timescales, while U_b directly affects only the mutation supply rate, so v depends on N and 
U_b separately. It is not a function of the commonly used parameter θ = 2NU_b. Instead, it 
is a function of the basic parameters Ns (which describes selective forces) and s/U_b (which 
describes the strength of selection relative to mutation), and it is valid in the regime where 
both these are large. The expression for q above is of order the basic selective timescale, 
(1/s) ln[Ns], divided by the basic mutation timescale, (1/s) ln[s/U_b], which makes sense since 
the lead is set by the balance between these two forces. The two factors that determine the basic 
time scales of the multiple mutation dynamics are 

$$ L = \ln[Ns] \quad \text{and} \quad \ell = \ln[s/U_b]. \tag{6} $$

Although these are both logarithmic in the population parameters and thus never huge, 
they can be large enough to be considered as large parameters. Many of our more detailed 
results are valid in the limit that both L and ℓ are large, with corrections (some of which 
we include) smaller by powers of 1/ℓ or 1/L. 
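
To illustrate that v depends on N and U_b separately rather than through θ = 2NU_b, the following Python sketch evaluates the heuristic speed of Eq. (5) for two populations with the same product NU_b. The parameter values and the function name are assumptions made only for this example.

```python
import math

def speed_eq5(N, s, U_b):
    """Heuristic speed of evolution, Eq. (5): v = 2 s^2 ln(Ns) / ln^2(s/U_b)."""
    return 2.0 * s**2 * math.log(N * s) / math.log(s / U_b) ** 2

s = 0.01
# Two assumed parameter sets sharing the same product N * U_b = 1e3.
v_large_N = speed_eq5(N=1e9, s=s, U_b=1e-6)
v_large_U = speed_eq5(N=1e7, s=s, U_b=1e-4)
print(v_large_N, v_large_U)   # same theta, different speeds
```

In this example the smaller population with the higher mutation rate evolves faster, even though both have the same θ.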

We will show below that our result for v is consistent with the fundamental theorem of 
natural selection, which states that when mutational effects can be ignored the speed of 
evolution is equal to the variance in fitness within the population. Viewed in this light, 
our result for the speed of evolution is not in itself novel: the speed is just the variance in 
fitness, as usual. What our analysis does is determine this variance. In many aspects 
of quantitative genetics, the variance of a quantitative trait (such as fitness here) is taken 
as some external parameter. When the variance has accumulated during a period when it 
was neutral, and is only starting to be selected on, this may be appropriate. But beyond 
that, it is surely not. Our analysis deals with the case in which variance is accumulating while 
being selected on: variance in fitness is increasing due to mutations while at 
the same time it is being acted on by selection. Then, even if the adaptation speed is only 
indirectly related to new mutations, it is essentially dependent on them: without mutations 
the variance will rapidly collapse to zero. We analyze here how a balance between the 
forces of mutation and selection develops to set a steady-state variance, and how large that 
variance is. 

However, neither our heuristic analysis above, nor our more careful work described below 
ever explicitly involves the fitness variance. Rather, the natural measure of the width of 
the fitness distribution is the lead. It is the lead, not the variance or standard deviation, 
that can be most productively thought of as a balance between mutation and selection. It is 
true, of course, that the variance is also increased by mutation and decreased by selection. 
However, this is not the clearest way to understand the behavior. The increase in the 
variance from mutations is delayed and indirect. The new mutations that occur at the nose 
will only increase the variance after they have grown enough, and by then the important 
new mutations that will keep the variance high later are happening further out in the nose. 
This is not to say that a variance-based approach (together, crucially, with higher moments) 
is impossible, but it is unwieldy and prone to hard-to-understand errors when 
any approximations are made. We discuss the problems with moment-based approaches in 
Appendix A. 

SIMPLEST MODEL 

We now turn away from crude (although powerful) intuitive arguments towards more 
rigorous analysis. We begin this section by defining the simplest model more precisely. We 
consider mutation, selection, and drift within a purely asexual population of constant size N. 
We assume that a large number of beneficial mutations, each of which increases the fitness by 
s, are available and define U_b to be the total mutation rate to these mutations. We consider 
the situation where the number of beneficial mutations fixed is small compared to the total 
number available so that U_b does not change appreciably over the course of the evolution 
(we relax this assumption in Appendix C). We neglect deleterious mutations and beneficial 
mutations of other strengths (see later sections for a discussion of the consequences 
of these assumptions). These simplifications are not essential and do not change the basic 
behavior in many situations. Indeed, we will argue that these assumptions can all be good 
approximations even when the situation is more complex, in particular when N or U_b are 
not constant, or in the presence of deleterious mutations or variable s, as we discuss in 
detail in subsequent sections. But, more importantly, these simplest approximations make 
the analysis clearer. 

In addition to the more innocuous simplifications, we make two essential biological assumptions: 
that there is no frequency-dependent selection, and that there is no epistasis, 
so that the fitness of an individual with k mutations is (k − ℓ)s greater than the fitness 
of an individual with ℓ mutations. When either of these conditions fails, the evolutionary 
dynamics can be very different from our predictions. Note, however, that our model can 
sometimes be a good approximation even in the presence of epistasis: the simplifying feature 
is the assumption that after one or more beneficial mutations have already been acquired, 
the distribution of available beneficial mutations in the new genetic background is similar 
to that in the "wild-type" background, but these need not be the same mutations as those 
available initially. 

Key Approximations 

There are two primary difficulties in analyzing the multiple subpopulations that occur 
even in the simplest model. The first is the stochastic aspects: when a subpopulation with 
a given fitness is small, stochastic drift plays a crucial role and must be handled correctly. 
The second is the interactions between the subpopulations: the constraint of fixed total 
population size means that there is effectively a frequency dependence to the growth of a 
subpopulation — albeit a simple one. 

To model the stochastic effects, we assume that the basic process of birth and death is 
a continuous-time branching process. All individuals have the same constant death rate 
1, which ensures that the average lifetime of an individual is 1 (i.e. the units of time 
are generations), and the lifetimes are exponentially distributed. Each individual in the 
population has some number, y, of beneficial mutations. We define ȳ to be the average value 
of y across the population (i.e. the average number of beneficial mutations per individual). 
An individual with y beneficial mutations has a birth rate 1 + (y − ȳ)s. This ensures that 
the average birth rate in the population is 1, so the population stays at a constant size N. 
We assume all individuals give rise to mutant offspring at a rate independent of their birth 
rate (i.e. mutants arise at a constant rate per unit time). If mutations instead occur at 
a constant rate per birth event, our assumption underestimates the mutation rate for the 
most-fit individuals. However, we always assume (y − ȳ)s ≪ 1 for all individuals (i.e. the 
lead, Q, is ≪ 1), so that the two definitions are almost equivalent. 
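
The statement that the birth rates average to 1 can be checked directly. The Python sketch below is illustrative: the fitness-class counts are hypothetical and the function name is an assumption. It computes ȳ and the birth rate 1 + (y − ȳ)s of each class, then confirms that the population-weighted mean birth rate equals the death rate.

```python
def birth_rates(class_sizes, s):
    """Map {y: number of individuals with y mutations} to ({y: birth rate}, y_bar).

    The birth rate of a class is 1 + (y - y_bar) * s, where y_bar is the
    population mean number of beneficial mutations.
    """
    N = sum(class_sizes.values())
    y_bar = sum(y * n for y, n in class_sizes.items()) / N
    return {y: 1.0 + (y - y_bar) * s for y in class_sizes}, y_bar

# Hypothetical fitness distribution: counts of individuals by mutation number.
classes = {10: 2_000, 11: 50_000, 12: 40_000, 13: 7_990, 14: 10}
rates, y_bar = birth_rates(classes, s=0.01)
N = sum(classes.values())
mean_birth = sum(rates[y] * n for y, n in classes.items()) / N
print(y_bar, mean_birth)   # the mean birth rate matches the death rate of 1
```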

The branching process model allows one to calculate simple analytic expressions for a 
number of important quantities which are not readily available in diffusion approximations 
of the standard Wright-Fisher model. However, branching process models cannot easily deal 
with the nonlinear saturation effects required to maintain a constant population size. By 
"saturation" effects, we refer to when a mutant subpopulation has become large enough 
to influence the mean fitness of the population, and hence begins to compete with itself, 
slowing its growth: this is the essential effect of the fixed total population size. To handle the 
saturation effects, we make use of a simple observation: stochastic effects are only important 
when a subpopulation is rare, while saturation is only important when a subpopulation is 
common. Thus we use the stochastic branching process model, ignoring saturation effects, to 
describe the dynamics of a subpopulation while it is small. Conversely, when it is large, we 
ignore random drift and treat it with the correctly saturating deterministic equations. Our 
use of both deterministic and stochastic analyses requires an appropriate way of linking the 
two together. In this paper, we will describe a method for doing so. This method accounts 
for all of the important aspects of genetic drift and is simple and intuitive. It should be of 
broad applicability to related evolutionary problems. 

This approach works as long as the stochastic regime and the saturation regime are 
different. That is, a subpopulation must become large enough to neglect random drift before 
it is too large to ignore saturation. We can treat a subpopulation of size n deterministically 
so long as ns ≫ 1. On the other hand, saturation can be ignored when n ≪ N. Thus in 
order to separate the stochastic and the saturating phases of growth of a subpopulation, we 
require Ns ≫ 1. Throughout this paper, we will assume this condition holds. 

A situation in which there are multiple subpopulations of varying sizes is illustrated in 
Fig. 4: this shows the logarithm of a typical fitness distribution within a steadily evolving 
population. Where the subpopulations are small, at the front of the distribution, stochastic 
analysis is necessary but nonlinearities can be ignored. When a subpopulation represents a 
substantial fraction of the total, nonlinear saturation is important but stochasticity is not. 
As long as Ns ≫ 1, there is an intermediate regime where neither matters. We can thus 
use a nonlinear deterministic analysis in the bulk of the distribution, a linear stochastic 
analysis near the front, and match the two in the intermediate regime in which both are 
valid. These approximations are fully controlled and any corrections to our results will be 
small for Ns ≫ 1. 

Our analysis is inapplicable when Ns ≲ 1, i.e. for small populations or those experiencing 
very weak selection. However, unless s is extremely small (s ~ U_b), a population small 
enough that Ns ≲ 1 will usually be too small for clonal interference or multiple mutation 
effects to matter. Thus requiring Ns ≫ 1 is not a serious limitation to exploring the effects 
we are concerned with here. 

Relationship of our Model to Wright-Fisher Model 

The deterministic limit of our model is identical to that of the Wright-Fisher model. 
However, the stochastic dynamics are slightly different. In the Wright-Fisher model, all 
individuals have a lifetime of exactly one generation, while in our model individuals have a 
random exponentially distributed lifetime with mean one generation. In the Wright-Fisher 
model, the distribution of the number of offspring per individual is approximately Poisson, 
while in our model the number of offspring is geometrically distributed. Both the mean 
lifetime and mean number of offspring per individual are identical in the two models (hence 
identical deterministic dynamics), but the different distributions do lead to slight differences. 
In particular, although the probability a beneficial mutation of size s (s ≪ 1) will become 
established is proportional to s in both models, it is ≈ cs with the coefficient c = 2 in the 
Wright-Fisher model and c = 1 in ours. Since it is likely that the population dynamics in 
any real population is not well represented by either of these models, there is no one "correct" 
model (e.g. for populations dividing by binary fission, as in many experimental studies 
of evolution, the fixation probability is closer to 2.8s (Johnson and Gerrish 2002)). Fortunately, 
in our analysis of the behavior of large populations, these differences only cause 
negligible corrections in the arguments of logarithms (e.g. replacing ln(Ns) with ln(cNs) 
when Ns ≫ 1). For smaller populations, however, the speed of evolution is proportional 
to the probability of establishment and thus does depend on more details of the model: in 
particular, the successional mutation result for the speed is v ≈ cNU_b s^2. 
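
The coefficient c = 1 of the branching-process model can be illustrated by direct simulation. The Python sketch below is a minimal illustration under stated assumptions, not the authors' procedure: it follows only the sequence of birth and death events of a single lineage (birth rate 1 + s, death rate 1), counts the lineage as established once it reaches an arbitrary threshold size, and uses a deliberately large s so that it runs quickly. The function name, threshold, seed, and trial count are all hypothetical choices.

```python
import random

def establishment_fraction(s, trials, threshold, seed=0):
    """Fraction of single-mutant lineages that reach `threshold` before dying out.

    Only the embedded jump chain is simulated: each event is a birth with
    probability (1+s)/(2+s) and a death otherwise, so event times are not tracked.
    """
    rng = random.Random(seed)
    p_birth = (1.0 + s) / (2.0 + s)
    established = 0
    for _ in range(trials):
        n = 1
        while 0 < n < threshold:
            n += 1 if rng.random() < p_birth else -1
        established += (n == threshold)
    return established / trials

s = 0.2                       # deliberately large so the example runs quickly
estimate = establishment_fraction(s, trials=5000, threshold=50)
print(estimate, s / (1.0 + s))    # simulation vs. the birth-death survival probability
```

For this birth-death process the survival probability is s/(1 + s), which reduces to s for small s, in line with c = 1.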

It would in principle be possible to use a diffusion approximation to the Wright-Fisher 
model instead of our branching process model. This would have the advantage of being able 
to handle saturation and drift at the same time, and thus cases where Ns ≲ 1. Such a 
model could in principle treat all the different subpopulations stochastically, including all 
mutations between these populations. However, this would lead to a complex and difficult to 
analyze infinite-dimensional diffusion process. There is, however, a controlled approximation 
to the full diffusion process, valid for large Ns, that is exactly equivalent to ours; as it 
would add little, we will not discuss this explicitly here. 



ANALYSIS 



This section contains the primary analysis presented in the present paper: the accumulation 
of beneficial mutations in the simple model described above. We begin by looking 
at what happens to a single mutant individual. We then ask what happens to a mutant 
population which is being fed constantly by new mutations. We next couple this analysis 
to the behavior of the rest of the population to gain an understanding of the evolution of 
large asexual populations and obtain our primary results. Finally, we connect this behavior 
to the small population regime. 



The Fate of a Single Mutant Individual 



We begin by considering the fate of a single mutant individual. We assume that in a 
large clonal population of size N, at time t = 0 there is a single mutant individual with a 
beneficial mutation conferring fitness advantage s. We denote the size of the subpopulation 
carrying this beneficial mutation at time t as n(t) (by assumption, n(0) = 1). We study the 
effects of selection and drift on this population by calculating the probability distribution of 
future n(t), Prob [n; t], assuming that no further mutations occur. This provides an essential 
building block for all the subsequent analysis, and also illustrates the basic approach in a 
simple context. 

Throughout this analysis, we assume that the number of individuals with the beneficial 
mutation is small relative to the total population size, n ≪ N. Thus the mutants do not 
significantly affect the overall fitness of the population, and hence do not interfere with 
one another. Naturally, if the mutant becomes established it will supplant the wild-type 
population and this condition will cease to be true. By this time, however, the mutant 
subpopulation will be large enough that we can switch from the stochastic analysis described 
here to a correctly saturating deterministic analysis. 

Because the mutant subpopulation is too small to affect the mean fitness, mutant individuals 
have a birth rate 1 + s and death rate 1. We define g(n, n_0, t) to be the probability 
of having n descendants at time t, starting from n_0 descendants at t = 0. We are interested 
in calculating g(n, 1, t). The probability of a birth or death event in a unit of time dt is 
(2 + s)dt, and this event is a birth with probability (1 + s)/(2 + s) and a death with probability 
1/(2 + s). This means that 

$$ g(n,1,t) = \frac{1}{2+s}(2+s)\,dt\,\delta_{n,0} + \frac{1+s}{2+s}(2+s)\,dt\,g(n,2,t-dt) + [1-(2+s)\,dt]\,g(n,1,t-dt), \tag{7} $$

where δ_{n,0} = 1 if n = 0 and is 0 otherwise. Assuming that individual lineages are independent 
(valid since n ≪ N), 

$$ g(n,2,t) = \sum_{a=0}^{n} g(a,1,t)\,g(n-a,1,t). \tag{8} $$

Using this fact and defining the generating function 

$$ G(z,t) = \sum_{n=0}^{\infty} g(n,1,t)\,z^n, \tag{9} $$

we find 

$$ \frac{\partial G(z,t)}{\partial t} = 1 + (1+s)[G(z,t)]^2 - (2+s)\,G(z,t), \tag{10} $$

with initial condition G(z, 0) = z corresponding to the one individual initially. We solve 
this differential equation for G(z,t), finding 

$$ G(z,t) = \frac{(z-1)(1-e^{st}) + zs}{(z-1)(1-(1+s)e^{st}) + zs}. \tag{11} $$
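
As a sanity check on the closed form, the following Python sketch numerically confirms that the generating function of Eq. (11) satisfies the differential equation (10), the initial condition G(z, 0) = z, and the expected long-time limit. It is an illustration only; the step size h, the test points, and the function names are assumptions.

```python
import math

def G(z, t, s):
    """Closed-form generating function of Eq. (11)."""
    e = math.exp(s * t)
    num = (z - 1.0) * (1.0 - e) + z * s
    den = (z - 1.0) * (1.0 - (1.0 + s) * e) + z * s
    return num / den

def ode_residual(z, t, s, h=1e-5):
    """Central-difference dG/dt minus the right-hand side of Eq. (10)."""
    dGdt = (G(z, t + h, s) - G(z, t - h, s)) / (2.0 * h)
    g = G(z, t, s)
    return dGdt - (1.0 + (1.0 + s) * g**2 - (2.0 + s) * g)

s = 0.05
max_residual = max(abs(ode_residual(z, t, s))
                   for z in (0.0, 0.3, 0.9) for t in (0.5, 10.0, 100.0))
print(max_residual)                       # limited only by finite-difference error
print(G(0.5, 0.0, s))                     # initial condition G(z, 0) = z
print(G(0.0, 1e3, s), 1.0 / (1.0 + s))    # long-time extinction probability
```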

We can now determine Prob[n; t] = g(n, 1, t) from G(z, t). A standard inversion yields 

$$ \mathrm{Prob}[n;t] = \frac{s^2 e^{st}}{[(1+s)e^{st}-1]\,[(1+s)e^{st}-1-s]} \left[\frac{(1+s)e^{st}-1-s}{(1+s)e^{st}-1}\right]^{n}, \tag{12} $$

valid for n > 0, and 

$$ \mathrm{Prob}[n=0;t] = \frac{e^{st}-1}{(1+s)e^{st}-1}. \tag{13} $$

We are interested primarily in understanding the distribution of n given that the mutant 
population is not destined to go extinct. This is given approximately by 

$$ A(n,t) \equiv \mathrm{Prob}[n;t\,|\,\text{not extinct}] \approx \frac{s}{(1+s)e^{st}-1-s} \exp\left[-\frac{sn}{(1+s)e^{st}-1}\right]. \tag{14} $$

Here we have approximated the geometric factor by a simpler exponential in n, which is valid 
for t ≫ 1, the regime of primary interest. Note, however, that although the crucial features 
are more apparent in the approximate expression, all the results below follow from the exact 
equations. 
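
The distribution in Eqs. (12) and (13) and the exponential approximation in Eq. (14) can be checked numerically. The Python sketch below is illustrative only: the values of s and t, the truncation point n_max, and the function names are assumptions. It confirms that the exact distribution is normalized, that its mean is e^{st}, and that the exact conditional probability is close to the approximation A(n, t) at a representative n.

```python
import math

def prob_n(n, t, s):
    """Exact distribution of lineage size, Eqs. (12) and (13)."""
    e = math.exp(s * t)
    a = (1.0 + s) * e - 1.0
    b = a - s
    if n == 0:
        return (e - 1.0) / a                       # Eq. (13)
    return s**2 * e / (a * b) * (b / a) ** n       # Eq. (12)

def A_approx(n, t, s):
    """Exponential approximation to the conditional distribution, Eq. (14)."""
    e = math.exp(s * t)
    a = (1.0 + s) * e - 1.0
    return s / (a - s) * math.exp(-s * n / a)

s, t = 0.05, 40.0
n_max = 20000    # truncation point (assumed); the neglected tail is negligible here
total = sum(prob_n(n, t, s) for n in range(n_max + 1))
mean = sum(n * prob_n(n, t, s) for n in range(1, n_max + 1))
survive = 1.0 - prob_n(0, t, s)
exact_conditional = prob_n(100, t, s) / survive
print(total, mean, math.exp(s * t))
print(exact_conditional, A_approx(100, t, s))
```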

At this stage, the above results merely reproduce classical analysis, but it is useful to 
pause to compare them with various intuitive predictions. We first compute the average 
number of mutant individuals at time t, 

$$ \langle n \rangle = e^{st}, \tag{15} $$

which confirms our understanding of what it means to have a beneficial mutation with 
advantage s. However, most of the time the mutation will die out. Conditional on not going 
extinct, 

$$ \langle n \,|\, \text{not extinct} \rangle = 1 + \frac{e^{st}-1}{s}, \tag{16} $$

which is larger at long times by a factor of 1/s. At short times, t ≪ 1/s, this is 
⟨n|not extinct⟩ ≈ 1 + t. At long times, t ≫ 1/s, the extinction probability becomes 1/(1+s) ≈ 1 − s, 
and ⟨n|not extinct⟩ = (1/s)e^{st}. Note that short times correspond to n ≪ 1/s, while long times 
mean n ≫ 1/s. (Note also that none of these expressions saturate as n approaches N; they 
are valid for n ≪ N, as discussed above.) 

It is useful to ignore mutations that are destined to go extinct due to drift, and focus 
only on those that are destined to become established. We do this for the remainder of this 
section; all results are thus implicitly conditional on non-extinction. However, some care is 
required. If a mutation occurs at time t = 0, and survives drift to become established, it 
may seem that on average it will grow as n(t) = e^{st}, because it started from one individual 
at t = 0 and grows on average exponentially. However, this is incorrect. Given that it 
survived drift, it is likely to have grown faster than e^{st} in the early stochastic phase of 
its growth during which drift is faster than selection (Barton 1995). This is apparent 
from the expressions above: for t ≪ 1/s, ⟨n|not extinct⟩ ≈ 1 + t, which is much faster than 
⟨n⟩ = e^{st} ≈ 1 + st. Once the population is large, and stochastic effects can be neglected, 
it naturally grows as e^{st}. However, because it grew faster than this in the early stochastic 
phase, it will on average be larger than if it had grown this fast through its entire history. 
As is clear from the expression for the average n at long times, ⟨n|not extinct⟩ ≈ (1/s)e^{st}, the 
behavior can be crudely approximated by assuming that it started at size 1/s (rather than size 
1) at t = 0, and then grew exponentially as e^{st} thereafter. This approximation is of course 
not valid during the early phase of growth. However, as we will later be concerned primarily 
with relatively large subpopulations, this is a simple way to take into account the stochastic 
effects. Note that the above also implies that, given that a mutation is not destined to go 
extinct due to drift, it will fix in a time of order (1/s) ln[Ns], not (1/s) ln N, as is sometimes seen 
in the literature. For s ~ 0.01, this is a difference of about 500 generations. To be more 
precise, the fixation time is a random variable with a distribution of width 1/s and mean close 
to (1/s) ln[Ns], rather than the naive (1/s) ln N. [For small s, this implies that the variation in the 
fixation time (~ 1/s) is small compared to the difference between the mean fixation time and 
the naive result.] 
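
The size of this correction is simple to evaluate. In the Python sketch below (illustrative only; the population size is an assumed value and the function name is hypothetical), the gap between the two estimates is (1/s) ln(1/s), which does not depend on N.

```python
import math

def fixation_time_estimates(N, s):
    """Return ((1/s) ln(Ns), (1/s) ln N): the estimate above and the naive one."""
    return math.log(N * s) / s, math.log(N) / s

with_drift, naive = fixation_time_estimates(N=1e8, s=0.01)   # N is an assumed value
gap = naive - with_drift                                     # equals (1/s) ln(1/s)
print(with_drift, naive, gap)
```

For s = 0.01 the gap is roughly 460 generations, in line with the figure of about 500 generations quoted above.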

For much of the subsequent analysis, we will be concerned with the size of a subpopulation 
only after it is big enough to be essentially deterministic. This is because when 
subpopulations are rare enough to be stochastic, they are too small to influence the mean 
fitness of the population and, as we shall see, are also small enough that the chance of 
mutations occurring within these populations is negligible (the former assumption requires 
Ns ≫ 1, while the latter we explore in Appendix G). However, as the above discussion 
makes clear, the stochastic phase of growth affects the later deterministic dynamics. Thus 
we are interested in "summing up" the stochastic effects in terms of their impact on later 
deterministic growth. We want to do better than merely understanding how stochasticity 
affects ⟨n⟩, as described above; we want to understand the full effect of stochasticity in 
determining the distribution of n(t) in its deterministic growth phase. 

Focusing only on the effects of stochasticity on later deterministic dynamics allows us to 
make a key simplification. Once the subpopulation is large enough to grow deterministically, 
but still small enough that saturation can be ignored (i.e. 1/s ≪ n ≪ N), its dynamics can 
be described by n = ν e^{st}. The value of ν is a random variable that depends on how fast the 
population grew in its stochastic phase. However, the only effect of this stochasticity on the 
later deterministic growth is to create random variation in ν. As almost all this stochasticity 
accumulates at short times, at large t (after the population has become deterministic) we can 
describe the overall effects of stochasticity in terms of a probability distribution Prob[ν]. 
This is a big simplification, because the full probability distribution conditioned on non-extinction, 
A(n, t), depends on both n and t, while for large t Prob[ν] is independent of t, as 
we show below. This simplification is possible because at large t the only time-dependence 
is the deterministic exponential growth. 

We can justify the above heuristic argument rigorously. The definition of ν is just a 
transformation of n, ν = n e^{−st}. This is valid in the early stochastic phase of growth as well 
as in the later deterministic phase. However, in the stochastic phase we do not expect that ν 
will be independent of t. As we have the probability distribution A(n, t), it is straightforward 
to transform this to the distribution Prob[ν; t]. When we take the large-t limit of Prob[ν; t], 
it becomes independent of t. This justifies our expectation that at large t, we have Prob[ν], 
independent of time. 
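
The large-t limit can be written out explicitly. The following is a sketch based on the approximate form of A(n, t) in Eq. (14), treating n as continuous; it is a worked step added for illustration rather than an expression quoted from the text.

```latex
\begin{align*}
% Change variables n = \nu e^{st} in Eq. (14), so that dn = e^{st}\,d\nu:
\mathrm{Prob}[\nu;t] &= A(\nu e^{st},t)\,e^{st}
  \approx \frac{s\,e^{st}}{(1+s)e^{st}-1-s}
    \exp\left[-\frac{s\,\nu\,e^{st}}{(1+s)e^{st}-1}\right] \\
% For st \gg 1 both ratios of the form e^{st}/[\,\cdot\,] tend to 1/(1+s):
  &\xrightarrow{\;t\to\infty\;}
    \frac{s}{1+s}\exp\left[-\frac{s\,\nu}{1+s}\right].
\end{align*}
```

The limit is an exponential distribution in ν with mean (1 + s)/s ≈ 1/s, consistent with ⟨n|not extinct⟩ ≈ (1/s)e^{st} at long times.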

Rather than using the probability distribution of ν, it will prove useful to define a related 
variable τ by 

$$ n = \frac{1}{s}\,e^{s(t-\tau)}. \tag{17} $$

The random variable τ is simply related to ν: τ = −(1/s) ln(νs). Since τ is a simple transformation 
of n, we can immediately calculate Prob[τ ∈ (τ, τ + dτ); t | not extinct] = prob[τ; t] dτ 
(with prob[τ] the probability density, as we are treating τ as a continuous variable) from 
A(n, t); again, for the remainder of this section, all distributions are conditional on non-extinction. 
We find 

$$ \mathrm{prob}[\tau;t\,|\,\text{not extinct}] = \frac{s\,e^{s(t-\tau)}}{(1+s)e^{st}-1} \exp\left[-\frac{e^{s(t-\tau)}}{(1+s)e^{st}-1}\right]. \tag{18} $$

As with ν, this describes the distribution of n both in the deterministic and stochastic phase. 
Since n depends on t, so does the distribution of τ. However, as expected from the previous 
discussion, the distribution of τ becomes independent of t for large t. We define τ_est as 
τ(t → ∞), and find 

$$ B(\tau_{\mathrm{est}}) \equiv \mathrm{prob}[\tau_{\mathrm{est}}\,|\,\text{not extinct}] = \frac{s}{1+s} \exp\left[-s\tau_{\mathrm{est}} - \frac{e^{-s\tau_{\mathrm{est}}}}{1+s}\right]. \tag{19} $$

The average value (as well as higher moments) of τ_est can be easily computed from this 
distribution. We have 

$$ \langle \tau_{\mathrm{est}} \rangle = \frac{1}{s} \ln\left[\frac{e^{\gamma}}{1+s}\right], \tag{20} $$

where γ is Euler's constant, γ = 0.577216. 
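
Equation (19) is straightforward to sample. Under the substitution u = e^{−sτ_est}/(1 + s), B(τ_est) dτ_est becomes e^{−u} du, so u is a unit-rate exponential variable. The Python sketch below uses this to draw τ_est and compare the sample mean with Eq. (20). It is an illustration only: the value of s, the sample size, the seed, and the function names are assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def sample_tau_est(s, rng):
    """Draw tau_est from Eq. (19) via u = exp(-s*tau)/(1+s), with u ~ Exp(1)."""
    u = rng.expovariate(1.0)
    while u == 0.0:                  # guard against log(0) in a vanishingly rare case
        u = rng.expovariate(1.0)
    return -math.log((1.0 + s) * u) / s

def mean_tau_est(s):
    """Eq. (20): <tau_est> = (1/s) ln[e^gamma / (1+s)]."""
    return (EULER_GAMMA - math.log(1.0 + s)) / s

rng = random.Random(1)
s = 0.02
samples = [sample_tau_est(s, rng) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean, mean_tau_est(s))   # Monte Carlo mean vs. Eq. (20)
```

The spread of the samples is of order 1/s, so the mean establishment time and its fluctuations are comparable in size.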

The variable τ_est has an intuitive interpretation: τ_est is the time at which n would have 
reached size 1/s had it always grown deterministically at rate s, as calculated by looking 
at n(t) at large t and extrapolating backward. This is illustrated in Fig. 5a. We can 
therefore approximate the destined-to-be-established subpopulation as drifting randomly 
for a time τ_est, at which time it reaches size 1/s, and then grows deterministically thereafter. 
With this simplification, the only important stochasticity is the duration of the drift period. 
This is the key simplification which allows us to smoothly connect the branching process 
with the nonlinear dynamics once the subpopulation is no longer rare. It jibes with our 
intuitive expectation that the subpopulation is dominated by drift when rarer than 1/s, and 
then behaves deterministically once it exceeds this size. Note, however, that in addition 
to telling us nothing about n(t) before time τ_est, it also gives a slightly inaccurate picture 
immediately after τ_est when n(t) is around 1/s. The time τ_est is not in fact the time at 
which the subpopulation reaches size 1/s (see Fig. 5a). Rather, it is the time at which n(t) 
would have reached size 1/s if we assumed that it always behaved deterministically, but it 
gets the large-t behavior right. In fact, some small drift does take place after reaching size 
1/s; our approximation doesn't ignore this drift, but rather adds up all the drift that takes 
place through all the time and rolls it into a change in τ_est. This can thus be thought of 
as the time at which the mutation establishes. In asking how quickly beneficial mutations 
accumulate, this is the most natural variable. 

The caveats above illustrate why it is perfectly consistent to have τ_est < 0; the distribution 
B(τ_est) above shows that this is not even particularly improbable. This reflects the fact that, 
given that a mutant subpopulation is not going to go extinct, it is reasonably likely to grow 
remarkably fast in the early stochastic phase. A τ_est < 0 simply indicates that the mutant 
subpopulation grew so fast when rare that if we look at the subpopulation size much later 
and assume it always grew exponentially at rate s, the subpopulation would have had a size 
larger than 1/s at t = 0. 
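
How probable a negative τ_est is follows directly from the distribution B(τ_est) of Eq. (19). As a short worked step (a sketch, using the substitution u = e^{−sτ_est}/(1 + s), which is introduced here only for convenience):

```latex
\begin{align*}
P(\tau_{\mathrm{est}} < 0)
  &= \int_{-\infty}^{0} B(\tau_{\mathrm{est}})\,d\tau_{\mathrm{est}}
   = \int_{1/(1+s)}^{\infty} e^{-u}\,du \\
  &= \exp\left[-\frac{1}{1+s}\right]
   \approx e^{-1} \approx 0.37 \qquad (s \ll 1).
\end{align*}
```

Under this distribution, roughly one in three established lineages has τ_est < 0 when s is small.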

We note that ⟨τ_est⟩ = (1/s) ln[e^γ/(1 + s)], while ⟨n(t)⟩ = (1/s)e^{st} for large t (as always, conditional 
on non-extinction). This may naively seem inconsistent, since n(t) = (1/s)e^{s(t − τ_est)} for large 
t. However, it merely reflects the fact that ⟨e^x⟩ ≠ e^{⟨x⟩}. The difference between these two 
averages is in fact the essential reason that τ_est will prove to be such a useful variable to 
focus on. This is because the value of ⟨n(t)⟩ depends much more sensitively on the tails of 
Prob[n; t] than does ⟨τ_est⟩. 

Mutants Generated by a Changing Population 

The above analysis of the population size of a clone founded by a single mutant individual 
is an important building block. However, it does not address the full problem. We must 
now ask how the mutants arise in the first place. In the simplest case, we might imagine 
a wild-type population of size N, starting with 0 mutants at time t = 0. This population 
generates mutants at rate NU_b. Each mutant follows the dynamics given in the above 
section, beginning at the time it was created, but now we have multiple such initial mutants 
which are created at random times. 

Generally, the relevant process is even more complex. Starting from a wild-type population, 
a single-mutant subpopulation is generated, experiences a stochastic period, and then 
begins to grow deterministically. Then double-mutants are created by mutation within the 
single-mutant population while it is still growing (i.e. before it fixes). The rate at which 
these double-mutants are generated increases with time because the single-mutant subpopulation 
is growing. Later, the double-mutants may themselves generate mutants before they 
fix (and possibly before the single-mutants fix), and so on. 

We therefore must tackle a more general problem: the distribution of the population size 
n(t) of a mutant clone which starts with 0 individuals and is "fed" by mutants from a less-fit 
clone of (growing) size f(t). If this less-fit clone is small enough that its growth is stochastic, 
calculating the probability distribution of the mutant population is extremely complex. 
Fortunately, most non-viral organisms live in parameter regimes where a subpopulation will 
never generate mutants destined to establish while it is still so small that it must be treated 
stochastically; we discuss this in Appendix G. Thus we take f(t) to be some deterministic 
function describing the growth of the subpopulation from which mutants arise. Later we will 
set the origin of time in f(t) stochastically, to reflect the stochasticity in the establishment 
of this feeding population. 

Note that we no longer need to condition on the mutant subpopulation not being destined 
to go extinct. Since this subpopulation is being continuously fed with new mutations, even- 
tually one of these mutations will survive drift. Thus at long times the mutant subpopulation 
will never be extinct. 

Unlike in the previous section, the growth rate of the stochastic mutant population $n(t)$
is not necessarily $1 + s$. Rather, for a population with a total of $y$ mutations, the growth
rate is $1 + (y - \bar{y})s$, where $ys$ is the fitness of the subpopulation $n(t)$ and $\bar{y}s$ is the mean
fitness of the population. For convenience, we write this as $1 + rs$. The death rate of this
population is still 1. Since $\bar{y}$ increases continuously, $r$ is time-dependent. Despite this,
we approximate $r$ as a constant. This is justified because we will only use the stochastic
description of $n(t)$ during the brief period during which it is rare, and in this time $r$ does
not change significantly. We discuss this approximation in Appendix H.

We define $\eta(t - t_k)$ to be the number of descendants at time $t$ of a single mutant which
occurred at time $t_k$. That is, given that a mutation occurs from the "feeding" population
to the population $n$ at time $t_k$, $\eta(t - t_k)$ is the number of descendants of this mutation at
a later time $t$. Note that $\eta$ is the random variable whose generating function is given by
$G(z, t - t_k)$ above, but with $s$ replaced by $rs$. We have

$$n(t) = \sum_{k=1}^{M} \eta(t - T_k), \qquad (21)$$

where $M$ is the random number of individual mutations that have occurred and $T_k$ are the
random times at which they occurred.

The number of mutations and their timings are an inhomogeneous Poisson process, fed
by the population $f(t)$. We therefore have

$$\mathrm{Prob}[M = m] = \exp\left[-\int_{-\infty}^{t} U_b f(t')\,dt'\right] \frac{\left[\int_{-\infty}^{t} U_b f(t')\,dt'\right]^{m}}{m!}. \qquad (22)$$



Note the lower limit of integration here represents the earliest time that mutations are
allowed to occur; we have chosen this to be infinitely early. Although this includes $f(t)$
which are zero for $t < 0$, we discuss this choice of cutoff more generally in Appendix E.
The timings of the mutations $T_k$, conditional on $M = m$, are the order statistics of $m$
independent identically distributed samples drawn from the distribution

$$F(x) = \begin{cases} \lambda(x)/\lambda(t), & x < t \\ 1, & x > t \end{cases} \qquad \lambda(x) = \int_{-\infty}^{x} U_b f(y)\,dy. \qquad (23)$$



This means that the joint distribution of the $T_k$ conditional on $m$ is given by

$$\mathrm{Prob}[t_1, \ldots, t_m \mid M(t) = m] = m! \prod_{i=1}^{m} \frac{f(t_i)}{\int_{-\infty}^{t} f(x)\,dx}. \qquad (24)$$



The generating function for the distribution of the number of mutant individuals, $n(t)$, is
given by $H(z,t) = \langle z^{n(t)} \rangle$. Conditioning on the distributions of $M$ and the $T_k$ given above,
and using the fact that $G(z, t - t_k) = \langle z^{\eta(t - t_k)} \rangle$, we find that

$$H(z,t) = \exp\left[U_b \int_{-\infty}^{t} \left[G(z, t - t') - 1\right] f(t')\,dt'\right]. \qquad (25)$$

To understand the full probability distribution of $n(t)$, we simply have to plug in the
appropriate form $f(t)$ and then invert this generating function.
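
The description of the feeding process as an inhomogeneous Poisson process can be made concrete with a short simulation. The sketch below is illustrative only and is not taken from the paper. It assumes an exponential feeding population $f(t) = \nu e^{R_1 t}$; the names `nu`, `R1`, `Ub`, `t_end` and both helper functions are hypothetical. It draws the number of mutations $M$ from a Poisson distribution whose mean is the time integral of $U_b f$, then draws the mutation times by inverting the cumulative distribution $F(x)$.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw a Poisson variate by counting unit-rate exponential arrivals.

    Exact but slow for large means (cost grows linearly with the mean).
    """
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_mutation_times(nu, R1, Ub, t_end, rng):
    """Sample ordered mutation times fed by f(t) = nu * exp(R1 * t), t <= t_end."""
    # Expected number of mutations: integral of Ub * f from -infinity to t_end.
    expected = Ub * nu * math.exp(R1 * t_end) / R1
    m = poisson_sample(expected, rng)
    # Given M = m, the times are i.i.d. with CDF F(x) = exp(R1 * (x - t_end)).
    # Inverse-CDF sampling: x = t_end + ln(u) / R1 with u uniform on (0, 1].
    times = [t_end + math.log(1.0 - rng.random()) / R1 for _ in range(m)]
    return sorted(times)


rng = random.Random(1)
times = sample_mutation_times(nu=100.0, R1=0.04, Ub=1e-3, t_end=50.0, rng=rng)
```

Because the feeding population grows exponentially, most sampled times cluster within a few multiples of $1/R_1$ before `t_end`, which is why the choice of lower cutoff matters little.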



An Exponentially Growing Population Feeding Another 

In large populations, there will typically be various multiple mutants present, as described
in the introduction and illustrated in Fig. 3. We can now apply the results of the previous
section to this situation. As before, we define the most-fit subpopulation that is large
enough to treat deterministically to have fitness $(q - 1)s$ above the mean fitness (note that $q$
is not necessarily an integer). This subpopulation, $n_{q-1}$, grows exponentially at rate $(q - 1)s$.
We define the origin of time such that $n_{q-1}(t)$ is given by

$$n_{q-1}(t) = \frac{1}{qs} e^{(q-1)st}. \qquad (26)$$

Note that, analogous to the previous section, we are approximating $q$ as constant; we
discuss this further below. The reason for defining the origin of time such that $n_{q-1}(t) = \frac{1}{qs}$
at $t = 0$ will become clear below. We now want to understand the stochastic dynamics of
the subpopulation a fitness $qs$ above the mean (denote this population size by $n_q(t)$). The
subpopulation $n_{q-1}$ feeds mutations to $n_q$; we therefore have $f(t) = n_{q-1}(t)$ in the notation
of the previous section.

This problem involves one exponentially growing population, $n_{q-1}$, feeding another, $n_q$.
In analyzing it, we first step back from our specific situation to study the general case of
an exponentially growing population with size $N_1 = \nu e^{R_1 t}$ feeding mutants at rate $U_b$
to a stochastic population $N_2$ which on average grows exponentially with rate $R_2$. We later
will substitute $\nu = \frac{1}{qs}$, $R_1 = (q - 1)s$, and $R_2 = qs$. We begin by plugging $f(t) = \nu e^{R_1 t}$
into Eq. (25), using the obvious generalization of $G(z,t)$ to a population that grows at rate
$R_2$. This gives us $H(z,t)$, the generating function of the probability distribution of $N_2$. It is
convenient at this point to pass from generating functions to Laplace transforms by defining
the transform variable $\zeta = 1 - z$. For our purposes we can assume that $\zeta$ is small: this
introduces errors into $\mathrm{Prob}[N_2; t]$ when $N_2 \sim 1$, but we will never use $\mathrm{Prob}[N_2; t]$ in this
regime. We find

$$H(\zeta,t) = \exp\left[-U_b \nu R_2 \int_{-\infty}^{t} \frac{\zeta e^{R_2(t-v)} e^{R_1 v}}{\zeta e^{R_2(t-v)} + R_2}\,dv\right]. \qquad (27)$$

Substituting $u = \zeta e^{R_2(t-v)}$, we find

$$H(\zeta,t) = \exp\left[-U_b \nu \left(\zeta e^{R_2 t}\right)^{R_1/R_2} \int_{\zeta}^{\infty} \frac{u^{-R_1/R_2}}{u + R_2}\,du\right]. \qquad (28)$$

Assuming that $\zeta$ is small, the integral in this expression is independent of $\zeta$ and is given by
$\left(\frac{1}{R_2}\right)^{R_1/R_2} \pi \csc[\pi(1 - R_1/R_2)]$. We find

$$H(\zeta,t) = \exp\left[-\frac{\pi U_b \nu \left(\zeta e^{R_2 t}\right)^{R_1/R_2}}{R_2^{R_1/R_2} \sin[\pi(1 - R_1/R_2)]}\right]. \qquad (29)$$

We can now substitute our values of $\nu$, $R_1$, and $R_2$ to find that in our case

$$H(\zeta,t) = \exp\left[-\frac{\pi U_b \left(\zeta e^{qst}\right)^{1 - 1/q}}{qs\,(qs)^{1 - 1/q} \sin(\pi/q)}\right]. \qquad (30)$$

This is the standard form for the Laplace transform of a one-sided Lévy distribution, a
well-studied special function. An integral representation of this is the inverse Laplace transform
of $H$,

$$P(n_q, t) = \mathrm{Prob}[n_q; t] = \frac{1}{2\pi i} \int e^{\zeta n_q} H(\zeta, t)\,d\zeta, \qquad (31)$$

where the integral is over the imaginary axis. For large $n_q$ this can be integrated to give
$P(n_q) \sim 1/n_q^{2 - 1/q}$. [Note this distribution has infinite $\langle n_q \rangle$, an unimportant and unbiological
artifact of our choice of cutoff in the integral for $H(z,t)$; this is discussed in Appendix E
below.]
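
The small-$\zeta$ step above rests on a standard integral: for $0 < R_1 < R_2$, the integral of $u^{-R_1/R_2}/(u + R_2)$ over $u > 0$ equals $(1/R_2)^{R_1/R_2}\,\pi \csc[\pi(1 - R_1/R_2)]$. The following numerical check is a sketch added for illustration; the function names, the substitution $u = R_2 e^{y}$ and the grid parameters are our own choices, not the paper's.

```python
import math


def feeding_integral(R1, R2, half_width=120.0, step=0.01):
    """Trapezoid estimate of the integral of u**(-R1/R2) / (u + R2) over u > 0.

    With u = R2 * exp(y) the integrand becomes R2**(-a) * exp((1-a)*y) / (1 + exp(y)),
    a = R1/R2, which decays exponentially in both directions when 0 < a < 1.
    """
    a = R1 / R2
    n_steps = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n_steps + 1):
        y = -half_width + i * step
        # Two algebraically equal forms, chosen to avoid overflow in exp().
        if y > 0:
            value = math.exp(-a * y) / (1.0 + math.exp(-y))
        else:
            value = math.exp((1.0 - a) * y) / (1.0 + math.exp(y))
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * value
    return R2 ** (-a) * total * step


def closed_form(R1, R2):
    """(1/R2)**(R1/R2) * pi * csc[pi * (1 - R1/R2)]."""
    a = R1 / R2
    return R2 ** (-a) * math.pi / math.sin(math.pi * (1.0 - a))


q, s = 3.0, 0.02                       # illustrative values only
numeric = feeding_integral((q - 1) * s, q * s)
exact = closed_form((q - 1) * s, q * s)
```

With $R_1 = (q-1)s$ and $R_2 = qs$ the sine reduces to $\sin(\pi/q)$, which is the factor that appears in the final form of $H(\zeta, t)$.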

To understand this distribution $P(n_q, t)$, we define a variable $\tau_q$ similar to that described
in the section above on the fate of a single mutant. We first define

$$n_q(t) = \frac{1}{qs} e^{qs(t - \tau)}. \qquad (32)$$

As before, $\tau$ is time dependent, but for $t \to \infty$ the distribution of $\tau$ is independent of $t$. We
define $\tau_q = \tau(t \to \infty)$. As before, $\tau_q$ is the time at which the subpopulation $n_q(t)$ would have
reached size $\frac{1}{qs}$ had it always grown deterministically at rate $qs$, as calculated by looking at
the size $n_q(t)$ at large $t$ and extrapolating backwards. Unlike $\tau_{\mathrm{est}}$, the value of $\tau_q$ includes
both the time for the mutation (or mutations) to arise in the first place and the time for
their initial stochastic growth. This is illustrated in Fig. 5b.

As in the section on the fate of a single mutant, we can think of the mutant subpopulation
as drifting randomly for a time $\tau_q$, at which point it reaches size $\frac{1}{qs}$ and thereafter grows
deterministically. We therefore sometimes refer to $\tau_q$ as the "establishment" time. As before,
this is somewhat inaccurate in describing the dynamics right around $\tau_q$ (or before) when the
population is around or below a size of $\frac{1}{qs}$. Again $\tau_q$ is not actually the time the population
reaches size $\frac{1}{qs}$. Now this is because both future random drift and future feeding
mutations, after the population reaches size $\frac{1}{qs}$, are included in the estimate of $\tau_q$. However,
for the purposes of understanding the dynamics of the mutant population once it becomes
large compared to $\frac{1}{qs}$, it is valid to think of $\tau_q$ as the time it takes the population to reach
size $\frac{1}{qs}$.

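We often wish to use moments of $\tau_q$. These are straightforward to calculate in principle,
but somewhat tricky in practice. We first note that because of the definition

$$n_q(t) \equiv \frac{1}{qs} e^{qs(t - \tau)}, \qquad (33)$$

we have

$$\tau = t + \frac{1}{qs} \ln\left[\frac{1}{qs}\right] - \frac{1}{qs} \ln[n_q(t)]. \qquad (34)$$

We can therefore calculate $\langle \tau \rangle$ by computing $\langle \ln n_q \rangle$ and plugging into this expression. Higher
moments of $\tau_q$ are easily computed by similar expressions; these depend also on higher
moments of $\ln n_q$. We can calculate $\langle \ln^m n_q \rangle$ by noting that
$\langle \ln^m n_q \rangle = \left. \frac{\partial^m}{\partial \mu^m} \langle n_q^{\mu} \rangle \right|_{\mu \to 0}$.
Using the integral representation of $P(n_q, t)$, we have

$$\langle n_q^{\mu} \rangle = \int_0^{\infty} dn_q \int \frac{d\zeta}{2\pi i}\, n_q^{\mu} \exp\left[\zeta n_q - a \zeta^{1 - 1/q}\right], \qquad (35)$$

where the $\zeta$ integral is over the imaginary axis and we have defined

$$a = \frac{\pi U_b e^{(q-1)st}}{qs \sin(\pi/q)\,(qs)^{1 - 1/q}}. \qquad (36)$$

We integrate this to find

$$\langle n_q^{\mu} \rangle = \frac{q}{q - 1}\, \frac{\sin(\pi\mu)\,\Gamma(1 + \mu)}{\sin\left(\frac{q\pi\mu}{q - 1}\right) \Gamma\left(1 + \frac{q\mu}{q - 1}\right)}\, a^{\frac{q\mu}{q - 1}}, \qquad (37)$$

where $\Gamma(x)$ is the Gamma function.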

We can now calculate derivatives of this with respect to $\mu$ to get $\langle \ln^m n_q \rangle$ and hence the
moments of $\tau$. For large $t$, as expected, $\tau$ becomes independent of $t$. For the mean of $\tau_q$, we
find

$$\langle \tau_q \rangle = \frac{1}{(q - 1)s} \ln\left[\frac{s}{U_b}\, \frac{q \sin(\pi/q)}{\pi e^{\gamma/q}}\right], \qquad (38)$$

where $\gamma = 0.5772$ is Euler's constant. The variance of $\tau_q$ is given by

$$\mathrm{Var}(\tau_q) = \ldots$$

Higher moments are also simple to compute if desired (and demonstrate that there is
substantial skew in the distribution of $\tau_q$, as $\tau_q$ substantially smaller than $\langle \tau_q \rangle$ can occasionally
occur, while $\tau_q$ substantially larger than this almost never does; this is important in
understanding the fluctuations in the rate of adaptation around its steady state value, and is
discussed in Appendix D).
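
The moments of $\tau_q$ can also be obtained numerically from the moment formula, which is a convenient cross-check on the algebra. The sketch below is not from the paper and rests on stated assumptions: it takes the transform of $P(n_q, t)$ to be $\exp[-a\zeta^{1-1/q}]$ with $a = \pi U_b e^{(q-1)st} / [qs\,(qs)^{1-1/q}\sin(\pi/q)]$, and writes $\langle n_q^{\mu} \rangle$ in the equivalent form $a^{\mu/\alpha}\,\Gamma(1 - \mu/\alpha)/\Gamma(1 - \mu)$ with $\alpha = 1 - 1/q$ (this follows from the reflection formula for the Gamma function). Cumulants of $\ln n_q$ are then taken by finite differences. The closed-form variance quoted in the comment, $\pi^2(2q - 1)/[6q^2(q - 1)^2 s^2]$, is our own derivation from the second cumulant, not a value copied from the source; all function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def log_moment(mu, q, log_a):
    """ln <n_q^mu> = (mu/alpha) ln a + lnGamma(1 - mu/alpha) - lnGamma(1 - mu)."""
    alpha = 1.0 - 1.0 / q
    return (mu / alpha) * log_a + math.lgamma(1.0 - mu / alpha) - math.lgamma(1.0 - mu)


def tau_mean_and_variance(q, s, Ub, t=0.0, h=1e-3):
    """Mean and variance of tau_q from finite-difference cumulants of ln n_q."""
    # ln a, with a as assumed in the lead-in, evaluated at time t.
    log_a = (math.log(math.pi * Ub) + (q - 1.0) * s * t
             - math.log(q * s) - (1.0 - 1.0 / q) * math.log(q * s)
             - math.log(math.sin(math.pi / q)))
    k_plus = log_moment(h, q, log_a)
    k_minus = log_moment(-h, q, log_a)
    mean_ln_n = (k_plus - k_minus) / (2.0 * h)   # first cumulant of ln n_q
    var_ln_n = (k_plus + k_minus) / h ** 2       # second cumulant (ln<n^0> = 0)
    # tau = t - [ln n_q + ln(qs)] / (qs)
    mean_tau = t - (mean_ln_n + math.log(q * s)) / (q * s)
    var_tau = var_ln_n / (q * s) ** 2
    return mean_tau, var_tau


def tau_mean_closed_form(q, s, Ub):
    """ln[(s/Ub) q sin(pi/q) / (pi e^{gamma/q})] / ((q-1) s)."""
    arg = (s / Ub) * q * math.sin(math.pi / q) / (math.pi * math.exp(EULER_GAMMA / q))
    return math.log(arg) / ((q - 1.0) * s)


def tau_variance_closed_form(q, s):
    """Our derivation: pi^2 (2q - 1) / (6 q^2 (q-1)^2 s^2)."""
    return math.pi ** 2 * (2.0 * q - 1.0) / (6.0 * q ** 2 * (q - 1.0) ** 2 * s ** 2)


mean_tau, var_tau = tau_mean_and_variance(q=5.0, s=0.02, Ub=1e-5)
```

Evaluating `tau_mean_and_variance` at different `t` returns the same mean, which illustrates the statement that $\tau$ becomes independent of $t$.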

This calculation of $\langle \tau_q \rangle$ is somewhat involved because of the need to use the integral
representation of $P(n_q, t)$. We can get rough estimates (often useful in other contexts) via a
simpler method. Namely, we define a "typical" population size by defining $\tilde{n}_q = 1/\zeta_1$, where
$\zeta_1$ is defined by $H(\zeta_1, t) = e^{-1}$. As is apparent from the definition of a Laplace transform,
$H(\zeta, t) = \int e^{-\zeta n_q} P(n_q, t)\,dn_q$, for well-behaved distributions this typical value $\tilde{n}_q$ is roughly like
the median of $n_q$. We can then get a typical value $\tilde{\tau}_q$ from this using the relationship between
$\ln n_q$ and $\tau_q$. Doing this leads to a $\tilde{\tau}_q$ that is very close to the $\langle \tau_q \rangle$ calculated above.

Note that the careful result for $\langle \tau_q \rangle$ is similar to the crude calculation in the heuristic
analysis section above, which approximated the time required for a new mutation to arise
at the nose as $\int^{\tau} U_b\, n_{q-1}(t)\, qs\, dt = 1$, roughly the typical time at which the first mutant
destined to establish arises. Note that this expression is only weakly dependent on the
lower cutoff to the integral, which is good since $n_{q-1}(t)$ is not given accurately by the
deterministic approximation in this regime. This weak dependence appears for the same
reasons in the careful calculation of $\tau_q$ and is discussed in more detail in Appendix E. The
crude and careful results do differ, however. The careful result accounts properly for the
randomness in the timing of a new mutation and the fluctuations during its early drift
phase. It also accounts for the fact that not only the first mutant destined to establish at
the nose contributes. Rather, as we will see later, of order $q$ different mutations contribute
significantly to the establishment of a new most-fit subpopulation at the nose. For later
considerations it is important to note that this means that there is significant diversity even
among the individuals which have the same fitness.

The Rate of Evolution and Maintenance of Variation at Large N 

We are now in a position to calculate the rate of evolution and amount of variation
maintained in large populations. In the above calculations, we set $t = 0$ to be the time at
which the population $n_{q-1}$ reached size $\frac{1}{qs}$. This corresponds to the establishment time of this
population. After a (stochastic) time $\tau_q$, the next more-fit subpopulation, $n_q$, establishes.
For the later deterministic dynamics of $n_q$, we can think of this as the time when $n_q$ reached
size $\frac{1}{qs}$. At this point, we have reached the identical situation where we started, but with the
nose of the population fitness distribution moved forward by $s$. In the steady state, the mean
fitness of the population must also have moved forward by $s$ in the average establishment
time $\langle \tau_q \rangle$. Thus the population at $n_q$ now has fitness only $(q - 1)s$ ahead of the mean. It
has size $\frac{1}{qs}$ but thereafter grows exponentially only at rate $(q - 1)s$, giving a population size
$\frac{1}{qs} e^{(q-1)st}$.

The process now repeats itself: we can take this establishment time of the new population
($n_q$, above) as the new $t = 0$, and after that this population grows as we had described
for the original population $n_{q-1}$. In fact, it now is the population $n_{q-1}$, since the mean
fitness has increased by $s$. Thus we can see that the mean fitness of the population and the
position of the nose move forward by $s$ in a time $\langle \tau_q \rangle$. Thus the average rate of increase in
fitness in the population is

$$v = \frac{s}{\langle \tau_q \rangle}. \qquad (40)$$

Note that this discussion makes clear why, for consistency, we defined the establishment
time for $n_{q-1}$ to be when this population reached size $\frac{1}{qs}$, not $\frac{1}{(q-1)s}$. We also note that
the population which we had originally called $n_{q-1}$ is now $n_{q-2}$ and its size is given by
$\frac{1}{qs} e^{(q-1)s\tau_q} e^{(q-2)st}$.

This change in the growth rate of the population we had originally called $n_{q-1}$ raises
an important point. We defined $n_{q-1}(t) = f(t) = \frac{1}{qs} e^{(q-1)st}$, and used this expression in
calculating $P(n_q, t)$, particularly for large $t$. Yet at this large $t$, our expression for $f(t)$ is not
accurate, because the mean has shifted and the population with original (relative) fitness
$(q - 1)s$ is no longer growing exponentially at rate $(q - 1)s$. Fortunately, the mutations that
occur after the establishment of $n_q$ (when the expression $f(t)$ becomes inaccurate) do not
greatly impact its later population size, $n_q(t)$. In other words, the mutations that dominate
the population $n_q$ happen early while $n_{q-1}$ is still accurately given by $f(t)$. Yet one must also
ask whether these mutations happen too early, when $f(t)$ is also not a good approximation
for $n_{q-1}(t)$ (because the definition of $\tau_q$, which we used to define $f(t)$, includes mutations and
stochastic behavior that happen later). Fortunately, the mutations that matter from $n_{q-1}$
to $n_q$ do occur late enough that $n_{q-1}$ is accurately described by $f(t)$. This can be checked
by studying the behavior of $\tau(t)$; we discuss this and related subtle issues in Appendix G.

When $q$ is too small, the approximations above are no longer justified. Whenever $q < 2$,
the growth rate of the subpopulation $n_{q-1}$ slows substantially during the period while the
important mutations to $n_q$ are occurring. That is, $n_{q-1}$ saturates while $n_q$ is becoming
established. Thus our analysis in this section is only valid for $q > 2$. As we will see,
this corresponds to large $N$. We discuss the $q < 2$ case in the next section. However,
it is the large-$N$, $q > 2$ result that we are most interested in: this is where there are
typically many multiple mutations at once, and the behavior differs dramatically from the
successional-mutations regime.

Throughout this section, we have asserted a steady state in which the mean fitness in- 
creases at the same rate as new mutations are established, and have defined the lead in 
steady state to be qs. Yet we have not discussed the balance between mutation and se- 
lection which sets this steady state. In a very small population, mutations are few and 
far between, and fixation times are relatively short. Thus a single-mutant population at s 
will establish and fix before another mutation becomes established. This is thus q = 1. In 
a somewhat larger population, a single-mutant population at fitness s will again establish 
and begin to grow. However, since the population is larger, it takes longer to fix. Before it 
does so, another mutation occurs in this single-mutant population, creating a double-mutant 
population at 2s. If the population is only moderately large, the single-mutant will replace 
the wild-type before a triple-mutant can arise. The process will then repeat; this is q = 2. 
As N continues to increase, we expect that q will also rise. 

The relationship between $q$ and $N$ can be obtained from $\tau_q$. As we have seen, immediately
after the subpopulation at $q$ becomes established, its size is $\frac{1}{qs}$. The subpopulation at $q - 1$
has size $\frac{1}{qs} e^{(q-1)s\tau_q}$, the subpopulation at $q - 2$ has size $\frac{1}{qs} e^{(q-1)s\tau_q} e^{(q-2)s\tau_q}$, and so on. All of
the subpopulations must add up to size $N$; in practice the total is dominated by one or a
few (compared to $q$) subpopulations. Applying the fixed total population condition (and
assuming that all the $\tau_q$ are on average $\langle \tau_q \rangle$), we find

$$\ldots \qquad (41)$$

This is a transcendental equation for $q$, but because of the logarithmic dependence on $q$
on the right hand side it is easily solved by iteration. For most purposes, even the zeroth
approximation,

$$q \approx \frac{2 \ln[Ns]}{\ln[s/U_b]}, \qquad (42)$$

is sufficiently accurate. To get higher accuracy one can plug this into the right hand side of
Eq. (41).
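
The iteration described above can be sketched in a few lines. The closure used here is an assumption of ours, since it depends on how the sum over subpopulations is approximated: we keep only the largest subpopulation, $N \approx \frac{1}{qs} \exp[q(q-1)s\langle \tau_q \rangle / 2]$, and take $(q-1)s\langle \tau_q \rangle = \ln[(s/U_b)\, q \sin(\pi/q) / (\pi e^{\gamma/q})]$. This gives the fixed-point condition $q = 2\ln[Nqs] / \ln[\ldots]$, whose right hand side depends on $q$ only logarithmically. The function names and parameter values are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def log_establishment_factor(q, s, Ub):
    """ln[(s/Ub) q sin(pi/q) / (pi e^{gamma/q})], assumed equal to (q-1) s <tau_q>."""
    return math.log((s / Ub) * q * math.sin(math.pi / q)
                    / (math.pi * math.exp(EULER_GAMMA / q)))


def lead_zeroth_order(N, s, Ub):
    """Zeroth approximation q ~ 2 ln(Ns) / ln(s/Ub)."""
    return 2.0 * math.log(N * s) / math.log(s / Ub)


def solve_lead(N, s, Ub, iterations=50):
    """Fixed-point iteration q <- 2 ln(N q s) / ln[...] (largest-class closure).

    Intended for parameters where q stays above 1, so that sin(pi/q) > 0.
    """
    q = lead_zeroth_order(N, s, Ub)
    for _ in range(iterations):
        q = 2.0 * math.log(N * q * s) / log_establishment_factor(q, s, Ub)
    return q


N, s, Ub = 1e9, 0.01, 1e-5             # illustrative values only
q0 = lead_zeroth_order(N, s, Ub)
q_star = solve_lead(N, s, Ub)
```

Because each update changes $q$ only through logarithms, the iteration settles within a handful of steps for parameters like these.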

As expected, the value of $q$ increases with $N$, and also increases with $U_b$ because when
mutations happen more quickly there are more of them in the population at once. The
dependence on $s$ is more complicated, because increasing $s$ both decreases the fixation time
(leaving less time for additional mutations to occur) and increases the rate of mutations
that establish (because it increases the establishment probability). Note that $q$ is of order
$\frac{1}{s}\ln[Nqs]$ (the basic selection timescale) divided by $\frac{1}{s}\ln(qs/U_b)$ (the basic mutation
establishment timescale). This makes sense, as $q$ is primarily what is determined by the balance
between these two forces.

With the value of $q$ determined self-consistently above (Eq. (41)), the mean fitness shifts
by $s$ in exactly the time $\langle \tau_q \rangle$. Thus the corresponding distribution of the subpopulations is
indeed a steady state. By plugging Eq. (41) into the expression for $\langle \tau_q \rangle$ and substituting
this into $v = \frac{s}{\langle \tau_q \rangle}$, we can obtain the speed of evolution. Doing this using the lowest-order
result in the iterative expansion for $q$ (Eq. (42)), we find that the speed of evolution is
roughly

$$v \approx s^2\, \frac{2 \ln[Ns] - \ln[s/U_b]}{\ln^2[s/U_b]}, \qquad (43)$$

valid provided $q$ is reasonably large (basically, when $2 \ln[Ns]$ is larger than $\ln\frac{s}{U_b}$). If a more
accurate result is needed, we can simply carry the iterative expansion for $q$ to higher order.
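
To see how weakly the speed depends on population size, the lowest-order result can be evaluated directly. The sketch below assumes the combination $v = s/\langle \tau_q \rangle$ with $\langle \tau_q \rangle \approx \ln(s/U_b)/[(q-1)s]$ and $q \approx 2\ln(Ns)/\ln(s/U_b)$, which gives $v \approx s^2[2\ln(Ns) - \ln(s/U_b)]/\ln^2(s/U_b)$. The function names and parameter values are illustrative, not the paper's.

```python
import math


def lead_zeroth_order(N, s, Ub):
    """q ~ 2 ln(Ns) / ln(s/Ub)."""
    return 2.0 * math.log(N * s) / math.log(s / Ub)


def speed_zeroth_order(N, s, Ub):
    """v ~ s^2 [2 ln(Ns) - ln(s/Ub)] / ln^2(s/Ub); needs 2 ln(Ns) > ln(s/Ub)."""
    log_ns = math.log(N * s)
    log_su = math.log(s / Ub)
    if 2.0 * log_ns <= log_su:
        raise ValueError("outside the regime where this estimate applies (q <= 1)")
    return s ** 2 * (2.0 * log_ns - log_su) / log_su ** 2


s, Ub = 0.01, 1e-5                     # illustrative values only
for N in (1e6, 1e8, 1e10):
    q0 = lead_zeroth_order(N, s, Ub)
    v = speed_zeroth_order(N, s, Ub)
    print(f"N={N:.0e}  q0={q0:.2f}  v={v:.3e}")
```

Raising $N$ by four orders of magnitude changes `v` by only a small factor, the logarithmic dependence discussed in the text.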

The calculations above confirm the intuitive picture and results described in the heuristic
analysis section above. The speed of evolution is determined by two mostly independent
factors. One factor is the dynamics of the nose: the feeding process from $n_{q-1}$ to $n_q$
which sets $\tau_q$. This process depends directly only on $U_b$ and $s$; the only impact of $N$ here
is via its effect on the lead $qs$. The other factor is the dynamics of the already established
populations. This is dominated by selection, and hence depends directly only on $N$ and $s$;
the only role of mutation here is its role in setting $q$.

Our result is consistent with the fundamental theorem of natural selection, which states
that the speed of evolution is equal to the variance of fitness in the population, provided
mutation is negligible. To see this, we first note that the bulk of the fitness distribution is
Gaussian. This is because a population with $\ell$ more mutations than the mean grows as $e^{\ell st}$,
and the mean shifts by 1 during every time interval of $\tau_q$. This means that at the end of
an interval, the number of individuals with $\ell$ mutations more than the mean is determined
by its cumulative growth over all these time intervals:
$\exp\left(-\sum_{k=1}^{\ell} k s \tau_q\right) \sim e^{-(s\tau_q/2)(\ell + 1/2)^2}$,
a Gaussian distribution. We call the variance of this fitness distribution $\sigma^2$. The number
of individuals that differ from the mean by $ks$ is then roughly $N \frac{s}{\sigma} \exp[-(ks)^2/2\sigma^2]$, and
the fittest established population (with $k \approx q$) will have of order $\frac{1}{qs}$ individuals. We
therefore expect $qs \approx \sigma\sqrt{2 \ln Ns}$. This means that if the fundamental theorem of natural
selection holds, we expect $v = \sigma^2 \approx \frac{q^2 s^2}{2\ln(Ns)}$. Indeed, some algebra verifies that this
yields the expression for $v$ in Eq. (43).

The fundamental theorem of natural selection should apply whenever mutation can be 
neglected compared to selection. Since this is true in the bulk (i.e., away from the nose) of 
the fitness distribution, the correspondence between our result and the theorem is reassuring. 
The speed of evolution is equal to the variance in fitness, as usual. Thus the crucial aspect 
of our calculation of the speed of evolution can be viewed as an analysis of how much 
variance in fitness a population maintains by mutations. That is, we have determined how 
a population maintains variance in fitness when it continually generates variation due to 
mutations while at the same time this variation is being selected on. As we noted earlier, 
however, neither the heuristic arguments nor the analysis involve the variance. To compare 
with the fundamental theorem we had to extract the variance from the analysis of the full 
fitness distribution, but this was not necessary for obtaining the primary results. This is 
because the lead proves to be a more useful measure of the width of the fitness distribution, 
because it is the lead that is directly affected by new mutations at the nose. The variance 
is of course also increased by mutations, but only as a consequence of the dynamics of the 
lead and only after the new mutant populations have grown to substantial numbers. The
key fact that the distribution is close to Gaussian out almost to the nose, which is many
standard deviations above the mean, indicates how little the region near the mean, which
controls the variance, matters for the dynamics.

Evolution at Moderate N 

In addition to the evolution at large $N$, we want to understand the crossover between
small-$N$ and large-$N$ behavior. In this subsection, we explore this crossover.

For very small $N$, the successional-mutations regime obtains. In the heuristic analysis
section, we noted that mutations take about $\frac{1}{NU_b s}$ generations to establish in this regime,
and then fix in a much shorter time. Thus evolution is mutation-limited, and we have
$v \approx NU_b s^2$. It is instructive to redo this calculation using the machinery we developed for
the large-$N$ case. To do this, we must replace the exponential form for $f(t)$. As before, we
take the establishment time of the mutation at $(q - 1)s$ to be $t = 0$. Of course, here $q = 1$
so $(q - 1)s = 0$. In this regime, each mutant fixes soon after becoming established. For the
purposes of the next establishment, we can therefore approximate the population at $(q - 1)s$
by

$$n_{q-1}(t) = f(t) = N\theta(t), \qquad (44)$$

where $\theta(t) = 1$ for $t > 0$ and 0 otherwise. We substitute this form of $f(t)$ into $H$ and
integrate, and take the inverse Laplace transform of the result to obtain

$$\mathrm{Prob}[\tau_1] = \frac{s}{\Gamma(NU_b)} \exp\left[-NU_b s \tau_1 - e^{-s\tau_1}\right]. \qquad (45)$$

This gives $\langle \tau_1 \rangle \approx \frac{1}{NU_b s}$, so the velocity $v \approx NU_b s^2$, as expected.
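
The successional-regime establishment time can be checked numerically. The sketch below assumes the density has the form $\frac{s}{\Gamma(NU_b)} \exp[-NU_b s\tau_1 - e^{-s\tau_1}]$ and works with the dimensionless variable $x = s\tau_1$, for which the density is $\exp[-ax - e^{-x}]/\Gamma(a)$ with $a = NU_b$. The integration limits, step size and helper names are our own choices.

```python
import math


def tau1_density(x, a):
    """Density of x = s * tau_1: exp(-a x - exp(-x)) / Gamma(a), with a = N * Ub."""
    return math.exp(-a * x - math.exp(-x) - math.lgamma(a))


def integrate(func, lo, hi, step):
    """Composite trapezoid rule on [lo, hi]."""
    n = int(round((hi - lo) / step))
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * step)
    return total * step


a = 0.05                               # N * Ub, small: successional-mutations regime
lo, hi, step = -5.0, 40.0 / a, 0.01    # the density is negligible outside this window
norm = integrate(lambda x: tau1_density(x, a), lo, hi, step)
mean_x = integrate(lambda x: x * tau1_density(x, a), lo, hi, step)
```

Here `norm` should come out close to 1 and `mean_x` close to $1/(NU_b)$, so that $\langle \tau_1 \rangle \approx 1/(NU_b s)$ when $NU_b$ is small.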

We now turn to the intermediate regime. For $NU_b$ comparable to $\frac{1}{\ln[Ns]}$ or larger, the
fixation time is not short compared to the establishment time. Thus we cannot use $f(t) =
N\theta(t)$. At the same time, the establishment time is not so short compared to the fixation
time that saturation in the feeding population is unimportant (the large-$N$ case we have
focused on thus far). We therefore need to consider the case of a growing and saturating
population feeding another. We assume that the single-mutant always fixes before the
triple-mutant population establishes, so that we only have to consider two deterministic and one
stochastic clone in the population (i.e. $q$ between 1 and 2). The dynamics of the single
mutant population a time $t$ after it establishes are given by

$$f(t) = \frac{N e^{st}}{Ns + e^{st}}. \qquad (46)$$

Note that $f(t)$ initially grows as $e^{(q-1)st}$, with $q = 2$, but later slows to $e^{(q'-1)st}$ with $q' = 1$
(i.e. it becomes approximately constant). The crossover occurs over a time interval of order
$1/s$, which is much smaller than the establishment times and is thus effectively a sharp
transition. The behavior of the feeding population is thus roughly equivalent to having $q$
between 1 and 2. The stochastic population that it feeds initially grows at rate $qs$ with
$q = 2$. The establishment of this stochastic population occurs at a time $\tau_2$ when, roughly,

$$\int_0^{\tau_2} U_b f(t)\,dt = \frac{c}{s}, \qquad (47)$$

with $c$ of order unity. This yields

$$\tau_2 \approx \frac{1}{s} \left[\ln Ns + \ln\left(e^{c/NU_b} - 1\right)\right]. \qquad (48)$$

A more careful analysis (analogous to the earlier calculations of $\tau_q$) that takes into account
the distribution of $\tau_2$ yields a result that is the same as the above simple argument but
with a factor of order unity inside $\ln Ns$, which is a small correction over the whole range
of validity. While in general $c$ will depend on the detailed birth and death processes, and
the speed of evolution in the successional mutations regime will be proportional to $c$, for the
dynamics we have analyzed throughout, $c = 1$. We use this below. For $NU_b < 1$, we obtain
[...] which crosses smoothly (and simply!) over from the successional mutations behavior
[...] for $Ns <$ [...] to $v =$ [...], which is just the result we obtain for $q = 2$. When $NU_b$
becomes of order unity, from the above expression we have $\tau_2 s \approx \ln Ns + O(1)$.

For $NU_b \gg 1$ the behavior is well into the multiple mutations regime we analyzed earlier,
and the results obtained for general non-integer $q > 2$ apply. The two sets of results match
together for $Ns \sim s/U_b$, up to order-unity factors inside logarithms of $Ns$ and of $s/U_b$. An
example of the crossover between the two regimes is shown in Fig. 6a.
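
The crossover can be explored numerically from the simple establishment criterion. The sketch below assumes $\tau_2 = \frac{1}{s}[\ln Ns + \ln(e^{c/NU_b} - 1)]$ with $c = 1$ and, as a further assumption of ours, that the population gains one fitness step $s$ per establishment time, $v = s/\tau_2$. The logarithm is rewritten as $x + \ln(1 - e^{-x})$ so that small $NU_b$ does not overflow. The function names are hypothetical.

```python
import math


def tau2(N, s, Ub, c=1.0):
    """tau_2 = (1/s) [ln(Ns) + ln(exp(c / (N Ub)) - 1)], evaluated stably."""
    x = c / (N * Ub)
    # ln(exp(x) - 1) = x + ln(1 - exp(-x)); avoids overflow when N * Ub is small.
    log_term = x + math.log1p(-math.exp(-x))
    return (math.log(N * s) + log_term) / s


def crossover_speed(N, s, Ub, c=1.0):
    """Assumed relation v = s / tau_2 (one step of size s per establishment)."""
    return s / tau2(N, s, Ub, c)


# Successional limit: N * Ub * ln(Ns) << 1, so v should approach N * Ub * s^2.
v_rare = crossover_speed(N=1e4, s=0.01, Ub=1e-7)
ratio_rare = v_rare / (1e4 * 1e-7 * 0.01 ** 2)

# N * Ub of order one: s * tau_2 should equal ln(Ns) up to an O(1) constant.
scaled_tau = 0.01 * tau2(N=1e6, s=0.01, Ub=1e-6)
```

In the first case `ratio_rare` is close to 1, recovering the mutation-limited speed; in the second, `scaled_tau` differs from $\ln Ns$ only by a constant of order one.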



TRANSIENT BEHAVIOR 



So far, our analysis has assumed that the mutation-selection balance has already been 
reached. If a population starts with an arbitrary distribution of fitnesses, it will gradually 
approach the steady state distribution. In this section, we describe this transient behavior. 
We focus on the case where the population is initially monoclonal. Other starting fitness 
distributions can be analyzed using similar methods. We consider the large-$N$ concurrent
mutations regime (in the successional-mutations regime the monoclonal population is already
essentially in steady state).

Starting from a monoclonal population, we can calculate the dynamics of the single-mutant
subpopulation that arises by using the small-$N$ results above, since here too the
feeding population is $f(t) = N\theta(t)$. It would now be tempting to assume that this single-mutant
population just grows exponentially at rate $s$ after first becoming established. We
could then immediately import our previous results for the establishment time of the
double-mutant population $\tau_2$, triple-mutant population $\tau_3$, and so on. We could then assume that
all these populations establish in order until the $q$th population, at which point the steady
state would be reached.

Unfortunately, this is wrong, for two reasons. First, the single-mutant population grows
faster than exponentially at rate $s$ because it is receiving mutations from the still-large
wild-type population. Because of this, the double-mutant population establishes more quickly
than the steady state $\tau_2$, and then itself grows faster than exponentially with rate $2s$ because
it is receiving more mutants from the fast-growing single-mutant population. This then
affects the triple-mutants, and so on. The second complication is that the mean fitness does
not stay at the wild-type value until the $q$th mutation has established, so it takes more than
$q$ establishments to reach steady state.

Rather than attempt to find a closed-form analytical result, we discuss here an algorithmic
solution to the transient dynamics. We proceed in steps. First, we calculate the lead from 
the current fitness distribution. Based on this, we calculate the next establishment time 
(interpolating if the lead changes during this period because of an increase in the mean 
fitness). We then calculate the new fitness distribution and the new lead, and repeat the 
process. 
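
The loop described above (compute the lead, find the next establishment, update the fitness distribution, repeat) can be sketched as a semi-deterministic simulation. This is a simplified illustration under our own assumptions, not the authors' algorithm. Established classes grow deterministically at rate $(k - \bar{k})s$; the next class at the nose is declared established, at size $1/(qs)$, once the accumulated integral of $U_b\, n_{\mathrm{nose}}\, qs$ reaches 1 (the crude criterion from the heuristic analysis, with no randomness); and continued mutational feeding of already established classes is ignored, so the early transient at large $NU_b$ is not captured accurately. All names and parameter values are hypothetical.

```python
import math


def simulate_transient(N, s, Ub, generations, dt=1.0):
    """Deterministic sketch of the approach to steady state from a monoclonal start.

    sizes[k] is the size of the established class carrying k beneficial mutations.
    Returns (history, sizes), where history holds (time, mean mutation number, lead q).
    """
    sizes = {0: float(N)}
    hazard = 0.0                      # accumulated establishment integral at the nose
    history = []
    steps = int(round(generations / dt))
    for step in range(steps):
        total = sum(sizes.values())
        mean_k = sum(k * n for k, n in sizes.items()) / total
        nose = max(sizes)
        q = nose + 1 - mean_k         # lead of the next, not yet established, class
        history.append((step * dt, mean_k, q))

        # Establishment: integrate Ub * n_nose * (q s) until it reaches 1.
        hazard += Ub * sizes[nose] * q * s * dt
        if hazard >= 1.0:
            sizes[nose + 1] = 1.0 / (q * s)
            hazard = 0.0

        # Selection: each class grows or shrinks relative to the current mean.
        nose_now = max(sizes)
        grown = {k: n * math.exp((k - mean_k) * s * dt) for k, n in sizes.items()}
        # Hold the total at N; drop classes below one individual (never the nose).
        scale = N / sum(grown.values())
        sizes = {k: n * scale for k, n in grown.items()
                 if n * scale >= 1.0 or k == nose_now}
    return history, sizes


history, sizes = simulate_transient(N=1e7, s=0.02, Ub=1e-5, generations=4000)
```

Plotting the second and third columns of `history` against time shows the mean fitness rising in steps while the lead settles toward a roughly constant value.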

When calculating the establishment times, we must remember that the feeding populations
are not necessarily growing as simple exponentials. Earlier we used the establishment
time $\tau_p$ to approximate the population size of $n_p$ as $n_p(t) = \frac{1}{ps} e^{ps(t - \tau_p)}$, a simple
exponential. We noted that this is inaccurate while $n_p$ is around $\frac{1}{ps}$, because it includes both future
mutations from $n_{p-1}$ to $n_p$ and future stochasticity. Since we have used this form of $n_p(t)$
to calculate the establishment time of the next more-fit subpopulation, this approximation
for $n_p(t)$ must be accurate by the time the mutations which lead to the subsequent
establishment occur. In the steady-state case, this holds, as shown in Appendix G. However, for
the transient dynamics it is not always correct.

This problem is most serious for the single-mutant population, which we consider now.
The wild-type population has roughly constant size $N$ during the period when the
single-mutant population is rare. This means that the single-mutant population grows on average
as

$$n_1(t) = \frac{N U_b}{s}\left[e^{st} - 1\right]. \tag{50}$$

This reaches size $\frac{1}{s}$ after a time of order $\frac{1}{N U_b s}$ generations. However, the
inferred establishment time (by extrapolating backwards) is $\tau_1 = -\frac{1}{s}\ln[N U_b]$
generations. This is substantially negative because mutations that occur well after the
population reaches size $\frac{1}{s}$ contribute significantly to $n_1$. The approximation we used
before would be to take $n_1 = \frac{1}{s}e^{s(t-\tau_1)} = \frac{N U_b}{s}e^{st}$ in calculating the
establishment time of the double-mutant population, $\tau_2$. But using the correct form of $n_1$,
we find that the first double-mutants occur roughly at time
$t = \frac{1}{s}\ln\left[1 + \frac{s}{N U_b^2}\right]$. Thus when $N U_b \ll \frac{s}{U_b}$
(corresponding to $q < 4$), double mutants do not occur until our usual approximation for $n_1$
becomes reasonable. We can therefore use our previous calculation of the establishment time
$\tau_2$ from the steady-state analysis above. All future establishment times (i.e. $\tau_3$ for the
triple-mutants, etc.) can similarly be imported directly from the steady-state calculations.
However, when $N U_b \gtrsim \frac{s}{U_b}$ ($q > 4$), we must use the correct form of $n_1$ to
calculate $\tau_2$ and $n_2$. In this case, $n_2$ will also grow faster than our usual
approximation $n_2 = \frac{1}{2s}e^{2s(t-\tau_2)}$ would predict. We must therefore repeat this
procedure to consider whether it is reasonable to calculate $\tau_3$ based on our usual
approximation, or whether we need to use the more complex form for $n_2$. However, this effect
is much weaker than for $n_1$: it only matters if $N U_b$ is much larger than in the previous
condition. If it does matter, we must again ask if the more complex form for $n_3$ will be
important in calculating $\tau_4$; this will only matter if $N U_b$ is larger yet. The number of
establishments for which we have to take this subtlety into account therefore depends on
$q = 2\ln(Ns)/\ln(s/U_b)$: the larger the steady-state $q$, the more transient establishments
we must consider. In practice, in comparing with previous experiments we have found that
considering the complex form of $n_1$ in calculating $\tau_2$ is sometimes necessary, but all
subsequent establishments can be calculated using the steady-state large-$N$ results (Desai et al.),
because in these experimental situations $q$ is never much larger than 4.
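
As an illustration of the point above, the following minimal Python sketch compares the full single-mutant growth law with the usual pure-exponential approximation. It assumes Eq. (50) reads $n_1(t) = (N U_b/s)(e^{st}-1)$ and that the leading-order steady-state lead is $q = 2\ln(Ns)/\ln(s/U_b)$; the function names and the parameter values are hypothetical and chosen only for demonstration.

```python
import math


def n1_exact(t, N, Ub, s):
    """Mean single-mutant size while the wild type stays at size N (assumed Eq. 50)."""
    return (N * Ub / s) * (math.exp(s * t) - 1.0)


def inferred_tau1(N, Ub, s):
    """Establishment time found by extrapolating the late exponential phase backwards."""
    return -math.log(N * Ub) / s


def n1_usual(t, N, Ub, s):
    """Usual approximation (1/s) exp(s (t - tau_1)), which equals (N Ub / s) exp(s t)."""
    return math.exp(s * (t - inferred_tau1(N, Ub, s))) / s


def steady_state_q(N, Ub, s):
    """Leading-order steady-state lead in units of s (assumed form)."""
    return 2.0 * math.log(N * s) / math.log(s / Ub)


if __name__ == "__main__":
    N, Ub, s = 1e7, 1e-5, 0.02  # hypothetical parameter values
    print("tau_1 =", inferred_tau1(N, Ub, s))
    print("q =", steady_state_q(N, Ub, s), "| N*Ub < s/Ub:", N * Ub < s / Ub)
    for t in (10.0, 100.0, 500.0):
        # The two forms differ strongly at early times and converge once s*t >> 1.
        print(t, n1_exact(t, N, Ub, s), n1_usual(t, N, Ub, s))
```

With these values $N U_b = 100$, so the inferred $\tau_1$ is negative, while $N U_b < s/U_b$ and the sketch gives $q < 4$, the regime in which the usual approximation is expected to suffice for $\tau_2$.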

A second subtlety in the above algorithmic approach is the way in which the mean 
fitness changes; it does not increase in evenly spaced steps of size s as it would in steady 
state. For example, the double-mutant subpopulation can become established soon after 
the single-mutant subpopulation does. Then, as it grows twice as fast, it will outcompete 
the single-mutant subpopulation while both are still rare. We call such an event a "jump,"
since it will lead to a jump in the mean fitness by 2s when the double-mutants become the
dominant subpopulation. Of course, it is also possible that the triple-mutants will jump 
past the double-mutants, or that the double-mutants will jump the singles, and then the 
quadruple-mutants will jump the triples, etc. These effects can lead to complex dynamics of 
the mean fitness during the transient time before the steady state is established. However, 
given the establishment times of the various populations, the time dependence of the mean 
fitness is straightforward to calculate from the deterministic dynamics of the competing 
subpopulations that are growing exponentially. 
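
A minimal sketch of such a calculation is given below. It is an illustration under stated assumptions rather than the procedure used for the results in this paper: each established $k$-mutant clone is taken to grow freely as $\frac{1}{ks}e^{ks(t-\tau_k)}$, the wild type is held at weight $N$, and frequencies are obtained by normalizing these weights. The establishment times and parameter values are hypothetical.

```python
import math


def clone_weights(t, N, s, taus):
    """Unnormalised clone sizes: wild type, then k-mutants established at taus[k-1]."""
    weights = [float(N)]
    for k, tau in enumerate(taus, start=1):
        if t >= tau:
            weights.append(math.exp(k * s * (t - tau)) / (k * s))
        else:
            weights.append(0.0)  # clone not yet established
    return weights


def frequencies(t, N, s, taus):
    """Clone frequencies from normalising the freely growing exponentials."""
    w = clone_weights(t, N, s, taus)
    total = sum(w)
    return [wk / total for wk in w]


def mean_fitness(t, N, s, taus):
    """Mean fitness, with a k-mutant having fitness k * s."""
    return sum(k * s * f for k, f in enumerate(frequencies(t, N, s, taus)))


if __name__ == "__main__":
    N, s = 1e6, 0.01
    taus = [0.0, 100.0]  # hypothetical: doubles establish soon after singles
    for t in range(0, 1201, 200):
        print(t, round(mean_fitness(t, N, s, taus), 5), frequencies(t, N, s, taus)[1])
```

In this example the single mutants never exceed a few percent of the population, and the mean fitness moves from 0 to roughly $2s$ without pausing near $s$, which is the "jump" described above.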

Putting all these effects together, we can construct an algorithmic solution for the tran- 
sient dynamics. We calculate the first establishment time, and note at what time this new 
subpopulation will change the mean fitness. We then calculate the next establishment time, 
and again the implied future effects on mean fitness (modifying previous such results if jump- 
ing events will occur). We continue to repeat this process. When the mean fitness changes, 
we note how this changes the lead and adjust the establishment times appropriately. We 
iterate this process until the steady-state lead, qs, is reached. Even after that there can be 
some lingering effects of the transient, as the rest of the fitness distribution may not yet have 
reached the steady-state gaussian profile. Yet soon thereafter the steady-state behavior is 
indeed reached. 

Rather than using this algorithmic approach, it is also possible to use a deterministic
approximation for the transient behavior. Starting from a monoclonal population, the timing
of the first few establishments is given accurately by a deterministic approximation.
However, this typically cannot give us the full transient dynamics, because stochastic effects
at the nose become important once the fitness distribution grows to a substantial width,
which usually occurs before the transient regime is over. This deterministic approach is also
less versatile, as it is only valid for some starting distributions.

The transient behavior can be quite important. During the transient phase, the accumulation
of beneficial mutations proceeds more slowly than in the steady state, because after
the first few establishments, but before the steady state is reached, the lead will be $ps$ with
establishment interval approximately $\tau_p > \tau_q$ (since $p < q$). Thus a clonal population will
accumulate beneficial mutations slowly at first, before the rate of accumulation gradually
increases to its steady-state rate. This slower transient phase lasts a substantial time,
longer than it takes to accumulate $q$ mutations once the steady state has been established,
again because $\tau_p > \tau_q$ for $p < q$ (and, as noted above, in fact it can take more than $q$
establishments to reach the steady state).
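
The ordering of establishment intervals can be made concrete with a short sketch. It assumes, purely for illustration, that the interval at lead $ps$ scales to leading order as $\tau_p \approx \ln(s/U_b)/[(p-1)s]$ for $p \geq 2$; this scaling, the function names, and the parameter values are assumptions of the example and ignore the corrections discussed above.

```python
import math


def establishment_interval(p, s, Ub):
    """Assumed leading-order establishment interval when the lead is p * s (p >= 2)."""
    if p < 2:
        raise ValueError("this scaling is only meant for p >= 2")
    return math.log(s / Ub) / ((p - 1) * s)


def transient_duration(q, s, Ub):
    """Time for the lead to build up from 2s to qs, one establishment at a time."""
    return sum(establishment_interval(p, s, Ub) for p in range(2, q + 1))


def steady_state_duration(q, s, Ub, n_establishments):
    """Time for the same number of establishments once the lead is already qs."""
    return n_establishments * establishment_interval(q, s, Ub)


if __name__ == "__main__":
    s, Ub, q = 0.02, 1e-5, 5  # hypothetical parameter values
    print("transient:", transient_duration(q, s, Ub))
    print("steady state:", steady_state_duration(q, s, Ub, q - 1))
```

Under this assumed scaling the $q-1$ establishments that build the lead take longer than the same number of establishments in the steady state, simply because each interval with $p < q$ is longer than $\tau_q$.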

DELETERIOUS MUTATIONS 

Our simplest model neglects deleterious mutations. But deleterious mutations can alter
the dependence of $v$ on the mutation rate (and on $N$), because increasing $U_b$ typically comes
at the cost of also increasing the deleterious mutation rate. This has proved an important
consideration in clonal interference analyses (Johnson and Barton 2002; Orr 2000).
In this section, we consider qualitatively and semi-quantitatively various effects of different
sized deleterious mutations in the simple model in which all the beneficial mutations have
the same $s$. The effects of deleterious mutations of size $s$ in this model have been studied
by Rouzine et al. (2003). Here we discuss briefly the effects of deleterious mutations of
various sizes, but leave detailed analysis for future work.

It is useful to separate the effects of deleterious mutations into their impact on the 
dynamics of the bulk of the distribution (and hence the mean fitness) and their effects 
on the establishment of new most-fit clones at the nose. In the bulk of the distribution, 
deleterious mutations come to a deterministic mutation-selection balance which alters the 
shape of the fitness distribution and reduces the mean fitness. This effect actually speeds 
up the evolution: if the deleterious mutations had no effect at the nose, their impact in 
reducing the mean fitness would increase the lead and thus make new establishments at 
the front occur faster. But deleterious mutations at the nose have the opposite effect: they 
slow down the growth of the most-fit populations and decrease the fitness of some of these 
individuals, reducing the rate at which new more-fit individuals establish. 

In understanding these effects, it is useful to consider large-effect and small-effect
deleterious mutations separately. First we consider deleterious mutations whose cost $s_d$ is larger
than $s$. When a deleterious mutation with $s_d > s$ occurs at the nose, that individual is no
longer at the nose. Thus the deleterious mutations just reduce the effective growth rate at
the nose. If $U_d$ is the mutation rate to deleterious mutations with $s < s_d$, then the growth
rates of subpopulations at the nose are simply reduced by $U_d$. The effect of deleterious
mutations on the mean fitness is also simple, because the mean fitness of the population
is dominated by the largest subpopulation (which is exponentially larger than all others).
Thus in considering the effect of the deleterious mutations on the mean fitness, we can focus
on their impact in this subpopulation. This remains the largest subpopulation for roughly
$\frac{1}{s}$ generations, which for these values of $s_d$ is larger than $\frac{1}{s_d}$. Thus it comes
to a deleterious mutation-selection balance while it is largest, since this balance is obtained in
$\mathcal{O}(1/s_d)$ generations. This means that the deleterious mutations reduce the mean fitness
by $U_d$ (up to small corrections due to the dynamics and the other subpopulations). This reduction in
the mean fitness effectively increases the lead by $U_d$, which increases the growth rates at the
nose by the same amount. This cancels the effect of the deleterious mutations at the nose.
Thus deleterious mutations with $s_d > s$ have very little net effect on $v$: they do not change
the rate of new establishments at the nose, up to the small corrections noted above. This
is not surprising: the deleterious mutants are all doomed, so roughly speaking their effect
is simply to reduce the effective fitness of all individuals equally, which has no net effect on
$v$. But they do increase the lead $qs$, which changes the shape of the fitness distribution and
can slow down the speed somewhat.

For weakly deleterious mutations with $s_d \ll s$, which occur at mutation rate $U_d$, the
effects are more complicated. In this case, the fact that an individual at the nose has a
deleterious mutation does not make it substantially less likely to be the source of a new
nose-extending mutation. Thus the effective growth rates at the nose are unaffected by deleterious
mutations. However, some nose-extending mutations will occur in individuals with one or
more deleterious mutations, and hence will not necessarily extend the nose by $s$. Instead,
they will sometimes have an effect $s - s_d$, or $s - 2s_d$, or less. We can estimate the strength of
this effect by using a deterministic approximation for the deleterious mutation accumulation
at the nose. When $\frac{U_d}{(q-1)s}\ln\left[\frac{s}{U_b}\right] \ll 1$ (or, roughly, when
$\frac{U_d}{s} \ll 1$), we find that on average, nose-extending mutations are burdened by a
deleterious load of $\frac{U_d}{(q-1)s}\,s_d\ln\left[\frac{s}{U_b}\right]$. Thus the effect of
the deleterious mutations at the nose is to reduce the effective $s$ by the amount
$\frac{U_d}{(q-1)s}\,s_d\ln\left[\frac{s}{U_b}\right]$,
which is small compared to $s$. This will tend to slow the evolution. An analogous calculation
applies when $\frac{U_d}{(q-1)s}\ln\left[\frac{s}{U_b}\right] \gg 1$; here the deleterious mutations
have a larger effect, but still only produce an average fitness cost at most of order $s_d$. The
effect of the deleterious mutations on the bulk of the distribution is again to reduce the mean
fitness of the population. The amount of this reduction, however, depends on the accumulation of
deleterious mutations throughout the fitness distribution, not just in the most-fit subpopulation
as before, because $\frac{1}{s} < \frac{1}{s_d}$. Still, this tends to reduce the mean fitness by an
amount of order $U_d$. This again speeds the evolution, and partially cancels the slowing effect at
the nose. Thus deleterious mutations with $s_d \ll s$ affect $v$ by increasing the effective lead by
$U_d$ and reducing the effective $s$ by $\frac{U_d}{(q-1)s}\,s_d\ln\left[\frac{s}{U_b}\right]$
(when $\frac{U_d}{(q-1)s}\ln\left[\frac{s}{U_b}\right] \ll 1$) or by of order $s_d$ (when it is larger).
These effects are all small.

Small effect deleterious mutations will also slow down the evolution via the accumulation
of them during the collective-sweep time, $qs/v \approx \ln(s/U_b)/s$, in which a subpopulation grows
from being the lead population to the dominant population. We expect this effect to be
largest relative to the effects of these deleterious mutations on the dominant subpopulations
when $1/s_d$ is of order the collective-sweep time. This effect reduces the speed by an amount
proportional to $U_d$.

To analyze in more detail the quantitative effects of deleterious mutations (even in the 
simplest single-beneficial-s model) is beyond the scope of this paper. Note in particular that 
the analysis in this section is invalid when the deleterious mutation rate is large enough 
that the deterministic approximation for their behavior at the nose becomes incorrect. In 
this regime — on the border between Muller's ratchet and adaptive evolution — a more 
careful analysis is needed. We leave this discussion, which is essential for understanding the 
dependence of the rate of evolution on the mutation rate when mutation rates become large, 
for future work. 

SIMULATIONS 

Our analysis involves a number of approximations. While we have analyzed their validity 
above and in the appendices, we also used computer simulations to test our results. In this 
section, we describe these simulations and the comparisons to our results. 

We started our computer simulations with a clonal population with a birth and death rate
of 1, and a mutation rate of $U_b$. We arbitrarily defined this population to have fitness 0. We
divided time into small increments. At each increment, we first calculated the average fitness
$\bar{y}$, and then produced births, deaths, and mutations with the appropriate probabilities. The
birth rate of individuals at fitness $y$ was set to be $1 + (y - \bar{y})$ (with $y - \bar{y}$ always
small compared to unity), their death rate 1, and the mutation rate $U_b$. We then repeated this
process to simulate the population dynamics, providing a full stochastic simulation of the
simplest constant-s, beneficial-only model analyzed above. We recorded the mean fitness
and lead as a function of time and, for each set of parameters, measured the average $v$ and
$q$ once past the initial transient regime.
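
The scheme just described can be sketched in a few lines of Python. This is a simplified illustration and not the code used for the results reported here: it tracks individuals by their number $k$ of beneficial mutations (fitness $y = ks$), allows at most one event per individual per time increment, and adds a factor $N/N_{current}$ to the birth rate to hold the population near size $N$. That regulation factor, the function name, and the parameter values are assumptions of this sketch.

```python
import random


def simulate(N, s, Ub, generations, dt=0.1, seed=0):
    """Stochastic sketch of the constant-s, beneficial-only model.

    Returns a list of (time, mean fitness, lead) tuples, one per time increment.
    """
    rng = random.Random(seed)
    counts = {0: N}  # number of beneficial mutations -> number of individuals
    history = []
    for step in range(round(generations / dt)):
        total = sum(counts.values())
        if total == 0:
            break  # extinction; not expected for reasonable N
        mean_y = s * sum(k * n for k, n in counts.items()) / total
        lead = s * max(counts) - mean_y
        history.append((step * dt, mean_y, lead))
        regulation = N / total  # assumption: keeps the population size close to N
        updated = dict(counts)
        for k, n in counts.items():
            p_birth = (1.0 + (k * s - mean_y)) * regulation * dt
            p_death = 1.0 * dt
            p_mut = Ub * dt
            for _ in range(n):
                u = rng.random()
                if u < p_birth:
                    updated[k] = updated.get(k, 0) + 1  # offspring in the same class
                elif u < p_birth + p_death:
                    updated[k] -= 1
                elif u < p_birth + p_death + p_mut:
                    updated[k] -= 1  # the individual moves to class k + 1
                    updated[k + 1] = updated.get(k + 1, 0) + 1
        counts = {k: n for k, n in updated.items() if n > 0}
    return history


if __name__ == "__main__":
    hist = simulate(N=200, s=0.05, Ub=0.01, generations=20, seed=1)
    print(hist[-1])
```

The per-individual loop keeps the sketch dependency-free but limits it to small populations; estimating $v$ and $q$ at the population sizes relevant here would require sampling whole fitness classes at once.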

We carried out these simulations at a variety of different parameter values. The match 
between simulations and our theoretical results was good, provided the conditions for the 
validity of the concurrent mutations regime obtained. Examples of these comparisons are 
shown in Figs. 6 and 7. In Fig. 6, we show the theoretical predictions for the average speed
of adaptation (using both the lowest order iterative result for $v$ presented in Eq. (45) and
a higher order iterative expansion) compared to simulation results as a function of $N$, $U_b$,
and $s$. In Fig. 7, we show similar comparisons for the average lead $q$ (again using the lowest
order iterative result for our theoretical predictions). The agreement is good in both cases,
though our theory slightly underestimates both $v$ and $q$. This may be due to the effects of
fluctuations in $\tau_q$ (described in Appendix D) slightly increasing the mean $v$ and $q$ because of
their non-linear effects, or to other factors arising from $\ln(s/U_b)$ not being sufficiently large
for the asymptotic results to obtain to this accuracy.

DISTRIBUTIONS OF s, AND RELATIONSHIP TO CLONAL INTERFERENCE 
ANALYSES 

The simple model we have analyzed assumes that all beneficial mutations confer the 
same advantage s. But in most natural situations different beneficial mutations will have 
different fitness effects. This does not change the basic dynamics of adaptation in large 
asexual populations: many beneficial mutations still occur before earlier ones have fixed and 
these can interfere with each other's fixation (Fig. 1b). And the successful mutant lineages
are likely to have had multiple beneficial mutations before they fix. But because mutations 
in different lineages cannot recombine together, many will be wasted when other lineages 
outcompete them. 

There are two reasons beneficial mutations are wasted. We have focused on the wasting
of mutations because they occur in individuals who are not very fit (i.e. away from the
nose) and are therefore handicapped by their poor genetic background. But when beneficial
mutations have a variety of different effects, there is another way they can be wasted: small
effect mutations can be outcompeted by larger mutations that occur in the same or similar
genetic background. We refer to this latter process as "clonal interference." As before, we
use the term "clonal interference" to refer to this second effect only (despite earlier broader
definitions), consistent with the focus of recent work on the subject. This can only occur
when not all mutations have the same fitness increment, and is thus absent in the simple
constant-s model.

Recent work by Gerrish and Lenski (1998) and others (Campos and De Oliveira 2004;
Gerrish 2001; Johnson and Barton 2002; Kim and Orr 2005; Kim and Stephan 2003; Orr 2000;
Wilke 2004) has taken the opposite approach to
the multiple constant-s mutations approximation and focused instead on the effects of
clonal interference, while ignoring multiple mutations. In this section, we first summarize
the conclusions of such analyses, which assume all mutations occur on the same genetic
background. We then consider the effects of including both clonal interference and multiple
mutations. As we will argue, whenever the former plays a significant role, so does the latter.

The now-conventional clonal interference analysis considers how small effect mutations
can be outcompeted by larger mutations. Specifically, if a mutation A with fitness $s_A$
becomes established, one considers the probability that another mutation B, with effect
$s_B > s_A$, will also become established before mutation A has fixed. If this happens, mutation
B can drive A to extinction and mutation A is thus wasted. Of course, it is also possible
that mutation B is subsequently outcompeted by a still fitter mutation C, and so on. The
key approximation is that the largest mutation which occurs and is not outcompeted by
a still larger one fixes, becomes the new "wild-type" (i.e. the majority population),
and the process then repeats. Additional mutations that might occur in a lineage which
already has mutation A, B, or C are ignored. For any fixed population size, there is some
selective advantage, $s_{ci}$, such that sufficiently large mutations, those with $s > s_{ci}$, are rare
enough that they are unlikely to occur before some less fit mutation arises and fixes. In
the conventional clonal interference analysis, it is assumed that a mutation of size around
$s_{ci}$ will thereby fix before any others, and the process will then repeat. This is equivalent
to successional-mutation behavior with a set of mutations each with the same strength, $s_{ci}$.
Since $s_{ci}$ increases with the population size, more mutations are wasted in larger populations,
implying that $v$ increases less than linearly with $N U_b$.

Before discussing the problems with the basic successional-fixation assumption, we consider
how the characteristic $s_{ci}$ depends on $N$ and on the distribution of selective advantages,
$\rho(s)ds$. Because only beneficial mutations with substantial $s$ matter for large $N$, the total
$U_b$ itself is not important. It is more convenient to use the mutation rate per generation for
mutations in a range $ds$ about $s$:

$$\mu(s)\,ds = \text{rate of mutations in interval } (s, s + ds) = e^{-\Lambda(s)}\,ds. \tag{51}$$

We assume that large-effect beneficial mutations are typically much less common than
small-effect ones, so that $\mu(s)$ is small and decreases rapidly with $s$. Since $\mu(s) = U_b\,\rho(s)$ is
dimensionless, it is convenient to define

$$\Lambda(s) \equiv \ln\left[\frac{1}{\mu(s)}\right], \tag{52}$$

which thus increases with $s$. We will see that $\Lambda(s)$ roughly plays the role that $\ln(s/U_b)$ does
in the single-s case; the equivalent condition to $s \gg U_b$ is that $\Lambda(s) \gg 1$ (at least for the
important range of $s$).

The basic clonal interference analysis is simple: in the time that a mutation of size $s_A$
will take to fix, $t_{fix} \approx \frac{1}{s_A}\ln[N s_A]$, some mutation of larger size $s$ will have time to occur
and become established as long as the total mutation rate for mutations larger than $s_A$ is
sufficiently large:

$$N\int_{s_A}^{\infty} s\,\mu(s)\,ds > \frac{1}{t_{fix}}. \tag{53}$$

Since $\mu(s)$ decreases rapidly with $s$, this will not happen when $s_A \gtrsim s_{ci}$, where

$$\Lambda(s_{ci}) \approx \ln(N s_{ci}). \tag{54}$$

That is, $s_{ci}(N)$ is the value of $s$ at which in the whole population there is of order one
mutation per generation: as $\mu(s) \propto U_b$, $s_{ci}$ depends only on the product $N U_b$, with the
functional form determined by $\rho(s)$. In the clonal interference analysis approximation, the
speed of evolution is assumed to be the size of these mutations $s_{ci}$ times the rate at which
they become established. This yields

$$v_{ci} \approx C N\left[\mu(s_{ci})\,s_{ci}\right]s_{ci}^2 \approx C N s_{ci}^3\,e^{-\Lambda(s_{ci})} \sim [s_{ci}(N)]^2, \tag{55}$$

where $C$ is a factor of order unity which is not really obtainable from clonal interference
analysis, as it depends on the details of further approximations (see footnote 1). At this point we
should note that various potential improvements are possible. In particular, it is not at all clear why
the establishment time rather than fixation time should be used to obtain the accumulation
rate of the $s_{ci}$ mutations. As we shall see below, if the latter rather ad hoc assumption is
made instead, the clonal interference analysis gives closer to the correct results for certain
distributions: those with long tails in $\rho(s)$.
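
To make the dependence of $s_{ci}$ on $N$ concrete, the sketch below solves the condition $\Lambda(s_{ci}) = \ln(N s_{ci})$ numerically by bisection. It assumes an illustrative family $\Lambda(s) = \ell + (s/\sigma)^\beta$ with $\beta \geq 1$ (the exponential case is $\beta = 1$), and it sets the unknown order-unity coefficient $C$ to 1; these choices, the function names, and the parameter values are assumptions made only for the example.

```python
import math


def big_lambda(s, ell, sigma, beta):
    """Assumed form Lambda(s) = ell + (s / sigma)**beta."""
    return ell + (s / sigma) ** beta


def s_ci(N, ell, sigma, beta, hi_factor=1e3, iters=200):
    """Solve Lambda(s) = ln(N s) for s >= sigma by bisection (intended for beta >= 1)."""
    def residual(s):
        return big_lambda(s, ell, sigma, beta) - math.log(N * s)

    lo, hi = sigma, sigma * hi_factor
    if residual(lo) >= 0 or residual(hi) <= 0:
        raise ValueError("no sign change on the search interval; N may be too small")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def v_ci(N, ell, sigma, beta, C=1.0):
    """Clonal-interference speed estimate v_ci ~ C * s_ci**2, with C unknown (set to 1)."""
    return C * s_ci(N, ell, sigma, beta) ** 2


if __name__ == "__main__":
    for N in (1e7, 1e8, 1e9):  # hypothetical population sizes
        print(N, s_ci(N, ell=10.0, sigma=0.01, beta=1.0), v_ci(N, 10.0, 0.01, 1.0))
```

For this exponential example $s_{ci}$ grows roughly in proportion to $\ln N$, so the estimated speed grows far more slowly than linearly in $N$.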

The above clonal interference analysis makes a crucial approximation which is essentially 
never valid: that double-mutants can be ignored even when mutations are common enough 
that they often interfere. This is manifest in the assumption that the important mutations 
only occur in the majority ("wild-type") population. The basic problem is that even if a 
more-fit mutation B occurs before an earlier but less-fit mutation A fixes, A may still survive. 
An individual with A can get another mutation D such that the A-D double mutant is fitter 
than B. If this happens, mutation A (along with D) can fix after all. Indeed, such events 
should be expected: any population large enough for clonal interference to matter is also 
large enough for double mutants to routinely appear even for $s \sim s_{ci}$. This is because clonal
interference can only affect the fixation of a mutation of size $s$ when the mutation rate to
mutations stronger than $s$, which we write $U_b^{>s}$, is large enough that $N U_b^{>s} \gtrsim 1$. But
when this occurs, the total number of beneficial mutations per generation satisfies
$N U_b \gg \frac{1}{\ln(s/U_b)}$. Thus, from our analysis of the single-s
model, whenever clonal interference occurs, multiple mutations also play a role.

The single-s model, in contrast, is unrealistic because it explicitly excludes competition 
between mutations of different effects. Thus the conclusions from this model and the clonal 
interference analysis are each only part of the story. In the remainder of this section, 
we outline the behavior for more general distributions of beneficial mutations, taking into 
account both clonal interference and multiple mutations. Fortunately, as we shall see, for 
many forms of $\mu(s)$, the single-s approximation can implicitly account surprisingly well for
the effects of clonal interference. Detailed analysis will be published elsewhere. 

Footnote 1: Note that the details of how we define fixation do not make much difference in the
clonal interference result. We have also ignored other factors inside logarithms, since
$\Lambda \gg 1$, but this will affect the coefficient $C$.

Let us first consider starting from a clonal population (although this is an oversimplification
which misses important aspects of the dynamics; see below). Depending on $N$ and $U_b$,
various different mutants will arise, as well as double-mutants, etc. One of these will be the
fittest mutant that is established in the wild-type population before any other mutation or
combination-of-mutations fixes. All the other mutations that have already occurred will be
driven to extinction and thus do not matter for the long-term evolution. For a given $N$, $U_b$,
and $\rho(s)$, there is a typical fitness effect (call this $\tilde{s}$) of the beneficial mutations that
create, singly or in combination, this fittest mutant. We call mutations of roughly this
magnitude predominant mutations, and define (crudely at this point) $\tilde{U}_b$ as the mutation rate
to these mutations. Clonal-interference-like competition determines the predominant range
of mutations. Unfortunately, however, we cannot simply lift the definition of $\tilde{s}$ from clonal
interference theory. Except at very short times, the population will not be mono-clonal but
will include various single and multiple mutants with a distribution of overall fitnesses. This
means that $\tilde{s}$ is determined by a delicate balance between clonal interference and multiple
mutation effects. Given an $\tilde{s}$, however, the predominant mutations accumulate via a process
similar to that described by our analysis of the constant-s model, with population size $N$
and the effective parameters $s = \tilde{s}$ and $U_b = \tilde{U}_b$.

Why should there be a predominant range of $s$? The basic argument is simple.
Mutations significantly smaller than $\tilde{s}$ occur frequently. But, by definition, these mutations
are routinely outcompeted by predominant mutants. Thus these mutations do not interfere
with the accumulation of the predominant mutants. In contrast, larger-than-$\tilde{s}$ mutations do
interfere with others when they occur. But, by definition, these must be rare enough that
it is unlikely that such a mutation will arise in the time it takes a predominant mutant,
or a combination of predominant mutations, to fix (else the larger mutation would be the
predominant mutant). Thus the population will primarily evolve via the accumulation of
mutations with $s$ in some range around $\tilde{s}$. Our previous analysis does not predict $\tilde{s}$, but
given a value of $\tilde{s}$ it determines how these mutations accumulate (see below for more details).
This is a slight oversimplification, as mutations of both smaller-effect and larger-effect than
$\tilde{s}$ will play some role. These considerations affect the appropriate definition of $\tilde{s}$, and the
range of $s$ around $\tilde{s}$ that is important.

What we must now address is the crucial fact that $\tilde{s}$ (and $\tilde{U}_b$) depend on $N$ and $U_b$.
As we increase $N$ or $U_b$, more mutations occur before others fix: this suggests $\tilde{s}$ will
thereby increase. Clonal interference analyses consider part of this process, and predict
that the analog of $\tilde{s}$ ($s_{ci}$) increases slowly with both $N$ and $U_b$ (indeed, with their
product) (Gerrish and Lenski 1998; Wilke 2004). But these approximations over-suppress
smaller mutations by ignoring multiple mutations, which are more likely to involve the
common smaller mutations. Thus we expect that $\tilde{s}$ should increase even more slowly with $N$
and $U_b$ than clonal interference models suggest. Nevertheless, even a slow increase in $\tilde{s}$ could
be important, since in the single-s model, $v$ increases with $s^2$ but only increases slowly with
$N$ and $U_b$. As we now show, the form of $\rho(s)$ qualitatively affects the behavior.

In the extreme case in which $\rho(s)$ decreases very slowly with $s$ ($\rho(s) \sim 1/s^3$ or slower), the
largest mutation that can typically occur and establish in a given time always dominates
the cumulative evolution up until that time. Thus a predominant $\tilde{s}$ does not even exist
and neither our analysis nor clonal interference describes the dynamics: it is controlled
by successional fixations, but with no steady-state speed, no matter how large the
population. We do not discuss this seemingly unlikely situation further.

Whenever $\rho(s)$ falls off faster than $1/s^3$, the basic single-s behavior obtains, with a narrow
range of $s$ (roughly a factor of two or less) around some predominant $\tilde{s}$, with the effective
mutation rate $\tilde{U}_b$ crudely being that for mutations in this range. But even though one could
then simply plug the appropriate $\tilde{s}$ and $\tilde{U}_b$ into our earlier expressions for the speed, $v$, the
single-s forms for the dependence on $N$ and $U_b$ may not be accurate, because $\tilde{s}$ and $\tilde{U}_b$
themselves depend on $N$ and $U_b$. There are two possibilities. The first is that $\tilde{s}$ and $\tilde{U}_b$
depend weakly enough on $N$ and $U_b$ that our expressions are roughly accurate. Another possibility
is that the evolution is dominated by larger and larger mutations as the population size
increases, as found in the clonal interference analysis. Again mutations in some restricted
range will control the behavior (and some degree of multiple mutations will still be involved),
but $\tilde{s}$ will increase markedly with $N$. We shall see that both these behaviors can occur,
depending on the form of the distribution of mutations $\rho(s)ds$.



Predominant s approximation 

A simple approximation that might be expected to be valid if a sufficiently narrow range
of $s$ dominates is to ignore all the mutations except those in some narrow range about $s$,
compute the evolution speed, $v(s)$, from the single-s analysis, and then maximize this over
$s$ to obtain the predominant $s$, $\tilde{s}$, and an approximation for the actual speed:

$$v \approx \max_s v(s) = v(\tilde{s}), \tag{56}$$

which defines $\tilde{s}$. We call this approximation the predominant s approximation, as it ignores
the question of how wide a range of $s$ is important. We can then make a conservative check
of our assumption that a narrow range of $s$ dominates by computing how quickly $v(s)$ falls
off away from $\tilde{s}$, because mutations at other $s$ cannot increase the actual velocity by more
than their $v(s)$.

For concreteness, we consider a class of distributions $\mu(s)$ parameterized by three
quantities: a characteristic selective advantage, $\sigma$, an (effective) overall mutation rate
$U_b \sim \sigma e^{-\ell}$ (so that $\ell$ is like the $\ln(\sigma/U_b)$ that appears in the single-s results),
and a parameter $\beta$ that characterizes the shape of the distribution of rare large mutations.
We thus write

$$\mu(s) = e^{-\ell - (s/\sigma)^\beta}. \tag{57}$$

For convenience we use the shorthand notation

$$L \equiv \ln(N\sigma). \tag{58}$$

We will see that the behavior depends qualitatively on whether $\beta$ is larger or smaller
than 1. For $\beta > 1$, the distribution falls off faster than exponentially, and we refer to this as
a "short-tailed" $\mu(s)$. The exponential case is exactly marginal. For $\beta < 1$, the distribution
falls off more slowly than exponentially. We refer to this as the "long-tailed" $\mu(s)$ case.



Short-tailed $\mu(s)$

We begin by considering the case of $\beta > 1$; that is, a distribution that falls off at
least exponentially. The behavior is simplest when the population size is large enough
that $2L/\Lambda(\tilde{s})$ is substantially greater than unity. This is loosely like $q(\tilde{s})$, the number of
mutations by which the nose is ahead of the mean fitness, being substantially larger than
unity (although $2\ln[N\sigma]/\Lambda$ differs from the actual $q$ in important ways). In this regime we
have

$$v(s) \approx s^2\,\frac{2L}{\Lambda(s)^2}, \tag{59}$$

which is maximum at $s = \tilde{s}$ given by

$$\frac{d\ln\Lambda}{d\ln s}\bigg|_{s=\tilde{s}} = 1. \tag{60}$$

The predominant s approximation is only valid in this regime for $\beta > 1$. It yields

$$\tilde{s} = \sigma\left[\frac{\ell}{\beta - 1}\right]^{1/\beta} \tag{61}$$

and thus

$$v \approx v(\tilde{s}) = C_\beta\,\sigma^2\,\frac{2\ln[N\sigma]}{\ell^{\,2-2/\beta}}, \tag{62}$$

with the coefficient $C_\beta = (\beta - 1)^{2-2/\beta}/\beta^2$. Note that $\tilde{s}$ is roughly independent of $N$ in this
regime, but decreases as the overall beneficial mutation rate increases (i.e. as $\ell$ decreases).
In other words, $\tilde{s}$ does not depend strongly on $N$, but does decrease as $U_b$ increases. This
makes sense: as $U_b$ grows, multiple small mutations become more important compared to
single larger mutations. Because of this, the dependence of $v$ on $N$ is very similar to our
single-s approximation, but the dependence on the mutation rate is weaker. We can check
for consistency of the use of the large-$q$ single-s results. The value of $q$ is $2\ln[N\sigma]/\Lambda(\tilde{s})$,
which yields

$$q \approx \frac{2L}{\ell}\,\frac{\beta - 1}{\beta}. \tag{63}$$

This is large when $2L/\ell$ is large, unless $\beta - 1$ is small, i.e. the tail is becoming long.
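
The closed-form results of the predominant s approximation can be checked numerically. The sketch below assumes $v(s) \approx 2L\,s^2/\Lambda(s)^2$ with $\Lambda(s) = \ell + (s/\sigma)^\beta$ and $L = \ln(N\sigma)$, maximizes it by a ternary search, and compares with $\tilde{s} = \sigma[\ell/(\beta-1)]^{1/\beta}$ and the corresponding speed and lead. Function names and parameter values are hypothetical, and the sketch is only meaningful for $\beta > 1$.

```python
import math


def v_single_s(s, N, ell, sigma, beta):
    """Assumed large-L form v(s) = 2 L s^2 / Lambda(s)^2, with L = ln(N sigma)."""
    lam = ell + (s / sigma) ** beta
    return 2.0 * math.log(N * sigma) * s * s / (lam * lam)


def predominant_s_numeric(N, ell, sigma, beta, iters=200):
    """Ternary search for the maximum of v(s), which is unimodal for beta > 1."""
    lo, hi = 1e-3 * sigma, 1e3 * sigma
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if v_single_s(m1, N, ell, sigma, beta) < v_single_s(m2, N, ell, sigma, beta):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def predominant_s_closed(ell, sigma, beta):
    """Closed form s~ = sigma * (ell / (beta - 1))**(1 / beta)."""
    return sigma * (ell / (beta - 1.0)) ** (1.0 / beta)


def v_closed(N, ell, sigma, beta):
    """Closed form C_beta * sigma^2 * 2 ln(N sigma) / ell**(2 - 2/beta)."""
    c_beta = (beta - 1.0) ** (2.0 - 2.0 / beta) / beta ** 2
    return c_beta * sigma ** 2 * 2.0 * math.log(N * sigma) / ell ** (2.0 - 2.0 / beta)


def q_estimate(N, ell, sigma, beta):
    """Consistency check q = (2 L / ell) * (beta - 1) / beta."""
    return 2.0 * math.log(N * sigma) * (beta - 1.0) / (beta * ell)


if __name__ == "__main__":
    N, ell, sigma, beta = 1e9, 8.0, 0.01, 2.0  # hypothetical parameter values
    print(predominant_s_numeric(N, ell, sigma, beta), predominant_s_closed(ell, sigma, beta))
    print(v_closed(N, ell, sigma, beta), q_estimate(N, ell, sigma, beta))
```

For the example values the estimated $q$ is only about 2, which illustrates how large $L/\ell$ must be before the large-$q$ forms are comfortably justified.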

The behavior for $\beta > 1$ can also be analyzed when $L$ is not so large. As $L$ decreases, the
predominant $\tilde{s}$ decreases, i.e. it begins to depend on $N$. The resulting expressions are
more complicated, but can be computed from the more general form for $v(s)$, Eq. (43), in
a similar way. However, they are of questionable validity, since only some of the significant
$s$ will be in the multiple mutation regime, while others will be in the crossover regime of
$q < 2$. As we have seen in a previous section, this crossover is complicated even for the
single-s model; it will be even more so with a distribution of $s$.

This brings us to the issue of how wide a range of $s$ plays a substantial role in determining
$v$ (as well as the steady state shape of the moving fitness distribution). In the successional
mutations regime, the speed is $v \approx N\int s^2\mu(s)\,ds$, so that $s$ of order $\sigma$ dominates (as long as
$\mu(s)$ falls off faster than $1/s^3$). That is, a range of $s$ within a factor of two or so of the typical
value $\sigma$ dominates the evolution. In the multiple mutations regime, the maximization of the
single-s speed $v(s)$ over $s$ gives a predominant $\tilde{s}$ much larger than $\sigma$, but no direct information
on the range of $s$ that contribute. A natural estimate is the range over which $v(s)$ is not lower
than $v(\tilde{s})$ by more than, say, a factor of two. This would mean that the width of the range
is comparable to $\tilde{s}$ itself: that is, mutations with effects between $\tilde{s}/2$ and $2\tilde{s}$ matter. This
confirms our assertion that the single-s model gives at least a good qualitative picture of the
dynamics even when there is a short-tailed distribution of the effects of beneficial mutations.
Since all the important mutations are of order $\tilde{s}$, "leapfrogging" (by which, for example, a
double-mutant gets a mutation which makes it more fit than an existing quintuple-mutant)
does not have a large effect on the evolution. We can thus indeed consider the basic dynamics
to be the accumulation of mutations of roughly size $\tilde{s}$ according to the single-s description
given above.

However, our calculation of the range of s that matters calls into question the predominant
s approximation: why should the actual v be $v(\tilde{s})$ rather than, e.g., v(s) averaged (or some
other weighted integration) over s? A more sophisticated analysis, which will be described
elsewhere, shows that for short-tailed distributions ($\beta > 1$), both $\tilde{s}$ and v are given correctly
by the predominant s approximation in the large $L/\ell$ limit, up to differing factors
inside logarithms and other small corrections. But the range of s that significantly affects v is
much smaller than that guessed from the predominant s approximation. This should perhaps
not be surprising, as the predominant s approximation assumes that all s contribute to v as if
different-sized mutations did not interfere. But interference will in fact tend to suppress the
contribution to v from s away from $\tilde{s}$. We find that for short-tailed distributions, in fact only
$|s - \tilde{s}|$ of order $\tilde{s}/\sqrt{\ell}$ are important. In terms of the mutation rate $U_b$ to mutations with $s \approx \tilde{s}$,
this range has width $\tilde{s}/\sqrt{\ln(\tilde{s}/U_b)}$. That this difference does not invalidate the predominant
s approximation result for v can be understood by considering the weak dependence of v on
the mutation rate in the single-s model. As v depends only logarithmically on $U_b$, replacing
$U_b$ by an effective $U_b$ that includes a substantial range around $\tilde{s}$, or by one that includes
only a narrow range, will only alter factors inside the logarithms and thus have little effect on
the inferred v. Since the fuller analysis finds that an even narrower range around $\tilde{s}$ matters,
it strengthens our contention that there is a predominant s (albeit one that depends on $U_b$)
and that the full dynamics is very similar to that of the single-s case analyzed in detail in
this paper. The exception to this is the intermediate N regime in which the crossover from
successional to multiple mutations occurs and the effective q is less than two or so: we will
not discuss this complicated crossover regime further here, although it may be relevant in
many experimental situations.

We have seen that the predominant s approximation does well for the primary quantities
of interest, $\tilde{s}$ and v, although it overestimates the range of s that plays a role. In contrast, the
clonal-interference-only analysis yields the incorrect behavior for short-tailed distributions.
For the model distributions, $\Lambda(s) = \ell + (s/\sigma)^\beta$, the clonal interference analysis yields

$$s_{ci} \approx \sigma\,(L - \ell)^{1/\beta}. \qquad (64)$$

For the short-tail case, this is much larger than the predominant value, $\tilde{s}$. Indeed it is
qualitatively wrong: $s_{ci}$ increases with increasing $U_b$, while $\tilde{s}$ decreases. Using $s_{ci}$ instead of
$\tilde{s}$ leads to incorrect predictions of v; in particular, clonal interference predicts v grows only
sublinearly with ln N. This problem stems from the fact that clonal interference analyses
have the wrong basic picture of the dynamics. The evolution is not in fact dominated by
the rare very large mutations that occur only once per generation in the full population, as
the clonal interference approximation implicitly assumes. Rather, the evolution is actually
controlled by multiple mutations of smaller (though still larger than average) fitness that
occur frequently even in the much smaller sub-populations that exist in the nose of the fitness
distribution of the steady state evolving population. Because the multiple mutation effects
depend on there being sufficiently large rates for the predominant mutations, increasing the
overall mutation rate allows multiple smaller mutations to beat larger ones. Thus increasing
$U_b$ results in decreasing $\tilde{s}$, in contrast to the increase of $s_{ci}$ with $U_b$.

Long-tailed $\mu(s)$

For distributions that fall off more slowly than a simple exponential, i.e. $\beta < 1$, the
behavior is rather different. This is apparent even in the crude predominant-s approximation.
Again, we begin by considering the simpler large $2L/\ell$ limit. But with the long-tailed $\mu$, we
need the fuller single-s expression:

$$v(s) \approx s^2\,\frac{2L - \Lambda(s)}{\Lambda(s)^2} \qquad (65)$$

with $\Lambda(s) = \ell + (s/\sigma)^\beta$. In the large $2L/\ell$ limit the predominant s is found to be

$$\tilde{s} \approx \sigma\left[\frac{4L(1-\beta)}{2-\beta}\right]^{1/\beta} \qquad (66)$$

with corresponding effective mutation rate

$$\mu(\tilde{s}) \propto \left(\frac{1}{N}\right)^{4(1-\beta)/(2-\beta)}. \qquad (67)$$

This yields

$$v \approx A\,\sigma^2\,(2L)^{2/\beta-1}, \qquad (68)$$

with coefficient $A = \beta\,(2-2\beta)^{2/\beta-2}\,(2-\beta)^{1-2/\beta}$. In this case, we see that v grows faster
than linearly with ln N. Surprisingly, the dependence on the mutation rate in this regime is
negligible: $U_b$ only determines how large N has to be to be in this regime. The smaller the
mutation rate, the larger the N needed. But in contrast to the short-tail case, here

$$q \approx \frac{2-\beta}{2(1-\beta)} \qquad (69)$$

is not large, so that even for very large N, the important multiple mutants still involve only
O(1) of the predominant mutations. The fact that q never becomes particularly large for
long-tailed $\mu(s)$ is because in this case $\tilde{s}$ increases substantially with N: in the short-tailed
case, many small mutations contribute, while in the long-tailed case, fewer larger mutations
are involved. But we must be careful with the above results for the long-tailed case, as they
are not valid if the inferred q is less than two: below this the crossover from successional to
concurrent mutations behavior will apply. We need to distinguish two cases.

If $\beta > 2/3$, then q > 2 and the above results apply. The corresponding effective mutation rate
decreases with a power of 1/N less than unity, so that the total mutation supply rate for the
predominant mutations, $N\mu(\tilde{s})$, grows with N as $N^{(3\beta-2)/(2-\beta)}$. (Of course, many of these
are wasted as multiple mutants outcompete the single mutants and control the dynamics,
as described by our single-s theory.)

If $\beta < 2/3$, then the above analysis would give q < 2 and $N\mu(\tilde{s}) \ll 1$, which indicates a
breakdown of the approximations. In this case q sticks at two, and the dynamics is basically
successional, with the predominant mutants being those for which the total rate $N\mu(\tilde{s}) \sim 1$.
This means that $\tilde{s} \approx \sigma L^{1/\beta}$, and we expect

$$v \approx \frac{\tilde{s}^2}{L} \approx \sigma^2 L^{2/\beta-1}. \qquad (70)$$

Note that the coefficient coincides with the earlier expression at $\beta = 2/3$. The steady state is
at the upper end of the crossover between the successional and multiple mutational behavior
as discussed in the section on evolution at moderate N.
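The long-tailed asymptotics can be checked numerically. The sketch below maximizes the fuller single-s speed $v(s) = s^2[2L - \Lambda(s)]/\Lambda(s)^2$ over s by a simple grid search, working in the variable $x = (s/\sigma)^\beta$, and compares the result with the large-$2L/\ell$ expressions $x = 4L(1-\beta)/(2-\beta)$ and $q = (2-\beta)/(2-2\beta)$. The function names and the parameter values (L = 700, $\ell$ = 7, $\beta$ = 0.8) are hypothetical, chosen only to probe the asymptotic limit rather than to represent a realistic population.

```python
import math

def log_speed(x, L, ell, beta):
    """ln of v(s)/sigma^2 for v = s^2 (2L - Lambda) / Lambda^2, with x = (s/sigma)^beta."""
    lam = ell + x                      # Lambda(s) = ell + (s/sigma)^beta
    return (2.0 / beta) * math.log(x) + math.log(2.0 * L - lam) - 2.0 * math.log(lam)

def predominant_x(L, ell, beta, steps=200_000):
    """Grid search for the x that maximizes the speed on 0 < x < 2L - ell."""
    x_max = 2.0 * L - ell
    best_x, best_val = None, -math.inf
    for i in range(1, steps):
        x = x_max * i / steps
        val = log_speed(x, L, ell, beta)
        if val > best_val:
            best_x, best_val = x, val
    return best_x

L, ell, beta = 700.0, 7.0, 0.8         # hypothetical values with 2L/ell = 200
x_num = predominant_x(L, ell, beta)
x_asym = 4.0 * L * (1.0 - beta) / (2.0 - beta)
q_num = 2.0 * L / (ell + x_num)
q_asym = (2.0 - beta) / (2.0 * (1.0 - beta))
print(f"x numeric = {x_num:.1f}, asymptotic = {x_asym:.1f}")
print(f"q numeric = {q_num:.2f}, asymptotic = {q_asym:.2f}")
```

The numerical maximum sits a few percent above the asymptotic value because $\ell$ is not entirely negligible, and q stays of order one rather than growing with L, in line with the discussion above.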

For $\beta < 2/3$, the clonal interference-only approximation agrees with the predominant s
approximation, as the total mutation rate to the predominant mutants is of order unity so
that $\tilde{s} \approx s_{ci}$. In contrast, for the intermediate case with $2/3 < \beta < 1$, clonal interference
analysis yields $s_{ci} \approx \sigma L^{1/\beta}$. This is still the correct behavior, but the numerical coefficient
is wrong: as noted above, the total mutation rate for the predominant mutants grows as a
power of N, in contrast to the clonal interference approximation in which it is assumed to be
independent of N. For the speed of evolution, naive application of the clonal interference
analysis gives $v \sim \sigma^2 L^{2/\beta}$, which is not even the correct scaling with L. But if, instead,
the fixation rate rather than the establishment rate is used to give an improved (though it
is not a priori clear why this should improve the result) clonal interference estimate of v,
the correct scaling with L can be obtained.

At this point, it is not clear how good the predominant s approximation is for the
long-tailed distributions, nor how wide a range of s around its predominant value is important.
A more sophisticated analysis is needed for this, as well as for understanding the crossover
from the successional $NU_b < 1$ regime to the large q regime analyzed above.

A simple example 

A concrete (albeit artificial) example is useful to illustrate the points made above. We
consider a simple model with three classes of mutations, each with a single s: weak mutations
with a small $s_s$, intermediate ones with a medium $s_m$, and strong mutations with a large
$s_l$; each class has its own mutation rate. Specifically, we consider $s_s = 10^{-3}$, $s_m = 10^{-2}$,
and $s_l = 10^{-1}$, with mutation rates $U_s = 9 \times 10^{-6}$, $U_m = 4 \times 10^{-6}$, and $U_l = 5 \times 10^{-10}$,
crudely approximating an exponential distribution of beneficial mutations (with, in terms
of the family of distributions discussed above, $\beta = 1$, $\sigma = 1.0 \times 10^{-2}$, $\ell \approx 7$).

For small population sizes, the successional regime obtains and

$$v \approx N\left[U_s s_s^2 + U_m s_m^2 + U_l s_l^2\right], \qquad (71)$$

which is dominated by the medium mutations. As N increases, we expect multiple mutations
to start to play a role when $NU_m \sim 1/\ln(s_m/U_m) \approx 1/8$, corresponding to crossover out of
the successional fixations regime for $N \sim 3 \times 10^4$.
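The arithmetic behind these two statements is easy to reproduce. The following sketch uses the three (s, U) pairs quoted above, evaluates the per-capita terms $U s^2$ that enter the successional speed, and estimates the crossover population size from $NU_m \sim 1/\ln(s_m/U_m)$. The dictionary layout and variable names are illustrative choices.

```python
import math

# Three-class example from the text: (effect s, mutation rate U)
classes = {
    "small":  (1e-3, 9e-6),
    "medium": (1e-2, 4e-6),
    "large":  (1e-1, 5e-10),
}

# Per-capita contribution U * s^2 of each class to the successional speed
contrib = {name: U * s**2 for name, (s, U) in classes.items()}
dominant = max(contrib, key=contrib.get)

# Crossover estimate for the medium class: N * U_m ~ 1 / ln(s_m / U_m)
s_m, U_m = classes["medium"]
N_cross = 1.0 / (U_m * math.log(s_m / U_m))

print(contrib)
print(dominant, round(N_cross))
```

The medium class contributes roughly forty times more than the small class and eighty times more than the large class, and the crossover estimate lands close to the quoted $N \sim 3 \times 10^4$.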

To understand the behavior for larger N, we first analyze the three types of mutations
separately, similar in spirit to the predominant s approximation. That is, we consider three
sub-models, each of which has only one of the three types of mutation. The corresponding
rates of evolution, $v_s$, $v_m$, and $v_l$, must all be less than $v_{tot}$, that of the full model, because
the full model has more beneficial mutations than any of the three sub-models. Conversely,
we expect $v_{tot} < v_s + v_m + v_l$ because, at best, the different mutations can accumulate
independently; in practice, they will tend to interfere (although multiple mutants with
combinations of the different types can matter and contribute to the actual speed). Each
of the three sub-models has only one type of mutation, so our single-s results can be used
directly to obtain $v_s$, $v_m$, and $v_l$.

For a population of size $N = 10^5$, just into the multiple mutations regime, we find
$v_s = 3.5 \times 10^{-7}$, $v_m = 1.5 \times 10^{-5}$, and $v_l = 5 \times 10^{-7}$. The leads of the corresponding fitness
distributions (the number of multiple mutants above the mean that exist at one time) are
$q_s = 2.7$, $q_m = 2.2$, and $q_l = 1$. Thus the small and medium mutations accumulate primarily
as double and triple mutants, while the large mutations (alone) would be in the
successional-mutations regime. For this moderate size population, the mutations with effect $s_m$ are the
predominant mutants. They clearly dominate the full model, since $v_{tot}$ will be in the very
narrow range between $v_m$ and $v_m + v_s + v_l$. Although the small mutations are common, they
do not matter because even triple-small mutants (as occur in the small-only model)
will be routinely outcompeted by single medium-mutations. The medium mutations occur
frequently in the fixation time of the triple-small mutants and thus routinely "leapfrog"
them. The small mutants never interfere with medium mutations, and those that fix do so
only because they happen to be linked to medium mutants. The large mutations, in contrast,
do interfere with the medium mutations, but occur so rarely that they are not important for
the overall evolution rate. In this example, a few hundred medium mutations fix for each
large mutation that establishes, so almost all medium mutations fix without being affected
by a large mutation. Thus the accumulation of mutations is very well approximated by the
process our single-s analysis describes, provided we choose $s = s_m$ and $U_b = U_m$.

As the population size is increased, $v_l$ will increase faster than $v_m$ or $v_s$ because it is not
yet in the regime with logarithmic N-dependence. For $N = 10^6$, $v_s = 5 \times 10^{-7}$, $v_m = 2 \times 10^{-5}$,
and $v_l = 5 \times 10^{-6}$. The medium mutations still predominate, but less strongly than before.
By $N = 10^7$, we have $v_s = 6 \times 10^{-7}$, $v_m = 3 \times 10^{-5}$, and $v_l = 5 \times 10^{-5}$, so the large mutations
begin to dominate. For larger N, they will do so even more strongly. This shows how $\tilde{s}$
increases with N. With this discrete $\rho(s)$, $\tilde{s}$ changes quite rapidly in a small range of ln N,
but for a continuous fitness distribution the increase will be smooth (of course, continuous
distributions present additional complications involving the proper weighting of mutations
near $\tilde{s}$).
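The shift of the predominant class from medium to large mutations can be illustrated with a rough calculation. The sketch below is not the calculation used to obtain the numbers quoted above, which come from the more precise single-s results; it assumes a crude interpolation in which each class evolves at the successional rate $NUs^2$, capped by the concurrent-mutations form $s^2[2\ln(Ns) - \ln(s/U)]/\ln^2(s/U)$ once the heuristic lead $q = 2\ln(Ns)/\ln(s/U)$ exceeds one. The function names and the interpolation rule are assumptions of this sketch, so its values agree with the quoted ones only to within a factor of two or so.

```python
import math

CLASSES = {"small": (1e-3, 9e-6), "medium": (1e-2, 4e-6), "large": (1e-1, 5e-10)}

def lead(N, s, U):
    """Heuristic lead q = 2 ln(N s) / ln(s / U)."""
    return 2.0 * math.log(N * s) / math.log(s / U)

def crude_speed(N, s, U):
    """Crude single-class speed (an assumption of this sketch, not the paper's method):
    the successional rate N U s^2, capped by the concurrent-mutations form
    s^2 [2 ln(N s) - ln(s/U)] / ln^2(s/U) once q > 1."""
    successional = N * U * s**2
    if lead(N, s, U) <= 1.0:
        return successional
    lam = math.log(s / U)
    concurrent = s**2 * (2.0 * math.log(N * s) - lam) / lam**2
    return min(successional, concurrent)

def predominant_class(N):
    """Return the class with the largest crude speed, and all three speeds."""
    speeds = {name: crude_speed(N, s, U) for name, (s, U) in CLASSES.items()}
    return max(speeds, key=speeds.get), speeds

for N in (1e5, 1e6, 1e7):
    name, speeds = predominant_class(N)
    print(f"N = {N:.0e}: predominant = {name}, "
          + ", ".join(f"v_{k} = {v:.1e}" for k, v in speeds.items()))
```

Even with this rough rule, the medium class is predominant at $N = 10^5$ and $10^6$ and the large class takes over by $N = 10^7$, because the large class is still gaining speed linearly in N while the other two gain only logarithmically.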

We could also apply clonal interference analysis to this three-class model. From these
analyses, for a beneficial mutation to fix, it must establish and then not be interfered with
by a more-fit mutation before it fixes. The probability that a mutation of size s will be
interfered with is $p_I(s) \approx \frac{NU_b}{s}\ln N \int_s^\infty r\rho(r)\,dr$. Thus the putative distribution of beneficial
mutations that fix will be $P_f(s) = K s\,e^{-p_I(s)}\rho(s)$, where K is a normalizing constant. The
average effect of a fixed beneficial mutation (effectively $s_{ci}$) would be the mean, $\langle s\rangle_F$,
of this $P_f(s)$. These mutations arise at average rate $\langle k\rangle_F = NU_bP_{fix}$, where $P_{fix}$ is the
average probability of fixation, $P_{fix} = \int_0^\infty s\,e^{-p_I(s)}\rho(s)\,ds$. Clonal interference analysis yields
$v = \langle s\rangle_F\langle k\rangle_F$. For our 3-class example, with $N = 10^5$, this gives $v_{tot} \approx 4 \times 10^{-5}$, about
3 times higher than the maximum possible as calculated from $v_{tot} = v_s + v_m + v_l$. For
$N = 10^6$, clonal interference predicts $v_{tot} \approx 4 \times 10^{-4}$, about 20 times too high. The problem
is easy to diagnose. For both values of N, the clonal interference theory correctly predicts
$\langle s\rangle_F \approx s_m$. However, implicit in the calculation of $\langle k\rangle_F$ is the incorrect assumption that
these medium mutations accumulate singly. Conversely, the predominant mutation approach
("max" approximation) is to choose $s_m$ as the single value of s, and then analyze how the
multiple-mutation process sets the rate at which this class of mutations accumulates.



DISCUSSION 



Beneficial mutations are often assumed to be rare, and adaptation therefore to be
mutation-limited. This is the basis for the picture of successional selective sweeps and
the conclusion that mutations arise and fix at a rate proportional to $NU_bs$ (Ewens, 2004).
This picture of successional sweeps underlies the strong selection weak mutation assumption
that is essential to many conclusions in population genetics and evolutionary theory. This
assumption is likely to be correct for the evolution of some strongly selected characters in
complex multicellular organisms. But most unicellular organisms and viruses tend to live
at much larger population sizes, and can have larger mutation rates. For such populations,
much of one's intuition from the rare-mutations picture will often be wrong. This makes it
important to go beyond the successional-mutations regime and to develop an understanding
of evolutionary dynamics when beneficial mutations are common.

This is a very broad subject. In this paper, we have focused on the concurrent-mutations
regime in which there is strong selection and strong mutation. By strong mutation, we mean
that the total beneficial mutation production rate $NU_b$ is sufficiently large that the time to
establish a mutant population is less than the time it will take to sweep to fixation. As
the establishment time is $1/(NU_bs)$ and the sweep time is $\frac{1}{s}\ln[Ns]$, the condition to be in
the concurrent mutations regime is $NU_b \gtrsim 1/\ln[Ns]$, so that multiple beneficial mutations are
present in the population and tend to interfere. By strong selection, we mean both $Ns \gg 1$
and $s/U_b \gg 1$. The former condition is what is commonly meant by strong selection, and is
required to ensure that selection is strong compared to drift except when subpopulations
are rare. The latter constraint makes the analysis simpler, because it ensures that only one
population at a time needs to be treated stochastically, but is not essential for the general
picture.

The concurrent-mutations regime that we analyze is likely to be quite common in nature.
Even if there are only ten or so beneficial point mutations available to a population which
has a per base pair mutation rate of order $10^{-9}$, this gives $U_b \sim 10^{-8}$. To have $NU_b \gtrsim 1/\ln[Ns]$,
we therefore only need population sizes of order $10^7$ ($1/\ln[Ns]$ will typically be roughly
1/10 for any reasonable values of s in such large populations). In other words, if there are
even a few mutations of effect $s \gg 10^{-7}$ available, a population as small as $10^7$ individuals
will experience the multiple concurrent mutation effects. These sizes are well within normal
ranges for many populations, including, for example, E. coli in a single human gut, cells in an
evolving cancer, pathogens within a single host, and many others. Moreover, this is a very
conservative estimate. Viral and certain bacterial populations, or mutator strains in any
organism, often have much higher overall mutation rates. Organisms with more beneficial
mutations available will also have much larger $U_b$. In recent experiments in S. cerevisiae
adapting to low glucose, we have inferred a beneficial mutation rate of Ub = 1 in
non-mutator strains and an order of magnitude higher in mutators (Desai et al., 2006). Such
values are not atypical (Joseph and Hall, 2004). For these values of $U_b$, and s of order
a percent or fraction of a percent, a mutator population of $N \sim 10^7$ will have $q \sim 4$, so
that quadruple-mutants will be present and sweep collectively (for nonmutators, $q \sim 3$).
With these parameters, each factor of about ten increase in N will increase q by one. In
general, we see that the concurrent-mutations regime is surely relevant for many microbial
populations.
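The population-size threshold in this estimate follows from comparing the two time scales defined earlier in this section. The sketch below takes the establishment time as $1/(NU_bs)$ and the sweep time as $\ln(Ns)/s$, and reports whether establishment is the faster process. The value $U_b = 10^{-8}$ is the conservative estimate from the text; s = 1% and the function names are hypothetical choices made for illustration.

```python
import math

def establishment_time(N, U_b, s):
    """Typical waiting time for a new beneficial mutant to establish: 1 / (N U_b s)."""
    return 1.0 / (N * U_b * s)

def sweep_time(N, s):
    """Typical time for an established mutant to sweep: ln(N s) / s."""
    return math.log(N * s) / s

def in_concurrent_regime(N, U_b, s):
    """True when establishment is faster than a sweep, i.e. N U_b > 1 / ln(N s)."""
    return establishment_time(N, U_b, s) < sweep_time(N, s)

U_b, s = 1e-8, 1e-2   # U_b from the estimate in the text; s = 1% is a hypothetical choice
for N in (1e5, 1e6, 1e7, 1e8):
    print(f"N = {N:.0e}: N*U_b = {N * U_b:.3g}, 1/ln(Ns) = {1 / math.log(N * s):.3g}, "
          f"concurrent = {in_concurrent_regime(N, U_b, s)}")
```

With these values the condition first holds at $N = 10^7$, where $NU_b = 0.1$ just exceeds $1/\ln(Ns) \approx 0.09$.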

Within the concurrent-mutations regime, we have explored how a population accumulates
beneficial mutations and maintains variation in fitness. The fundamental theorem of natural
selection states that the rate of increase in the mean fitness of a population equals the
variance in fitness (Fisher, 1930). This remains true. Our work demonstrates how the
variance is itself determined: how fitness variation accumulates while it is being selected on.
The key here is the balance between selection narrowing the fitness distribution and mutation
broadening it. This is an unusual type of mutation-selection balance, very different from the
deleterious case. Only mutations at the nose of the distribution matter. Others inherit a less
good genetic background and do not contribute to the long term evolution of the population:
they are destined to be outcompeted by new mutations at the nose. The dynamics at the
nose, where subpopulation sizes are small, dominates the behavior. This means that the
natural measure of the width of the distribution is the lead, not the variance, in contrast
to conventional treatments. It also means that random drift and finite N effects are crucial,
even for arbitrarily large N as long as there are more than a few beneficial mutations to
be acquired. Thus for any treatment of evolution in fitness "landscapes," these effects need
to be taken into account whenever the population is not localized around a fitness peak:
in contrast to quasi-species equilibria near fitness peaks, deterministic approximations give
nonsense.

By matching the speed of advance of the nose with the speed of advance of the bulk of the
distribution, we have shown that the lead depends logarithmically on N and $U_b$ according
to the formula

$$q = \frac{2\ln[Ns]}{\ln[s/U_b]}. \qquad (72)$$

This leads to a speed of evolution that is also logarithmic in N and $U_b$,

$$v \approx s^2\,\frac{2\ln[Ns] - \ln[s/U_b]}{\ln^2[s/U_b]}. \qquad (73)$$
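The logarithmic scaling can be made concrete by evaluating the lead $q = 2\ln(Ns)/\ln(s/U_b)$ and the speed $v = s^2[2\ln(Ns) - \ln(s/U_b)]/\ln^2(s/U_b)$ over several decades of N. The parameter values below (s = 1%, $U_b = 10^{-5}$) and the function names are hypothetical and serve only as an illustration; the expressions apply only within the concurrent-mutations regime.

```python
import math

def lead_q(N, U_b, s):
    """Lead of the fitness distribution, q = 2 ln(N s) / ln(s / U_b)."""
    return 2.0 * math.log(N * s) / math.log(s / U_b)

def speed_v(N, U_b, s):
    """Speed v = s^2 [2 ln(N s) - ln(s/U_b)] / ln^2(s/U_b)."""
    lam = math.log(s / U_b)
    return s**2 * (2.0 * math.log(N * s) - lam) / lam**2

s, U_b = 1e-2, 1e-5    # hypothetical parameter values
for N in (1e6, 1e7, 1e8, 1e9):
    print(f"N = {N:.0e}: q = {lead_q(N, U_b, s):.2f}, v = {speed_v(N, U_b, s):.2e}")

# Each tenfold increase in N adds the same fixed amount to q
dq_per_decade = 2.0 * math.log(10.0) / math.log(s / U_b)
print(f"increase in q per decade of N: {dq_per_decade:.2f}")
```

A tenfold increase in N raises q by the constant $2\ln 10/\ln(s/U_b)$ and increases v by far less than a factor of ten, which is the sense in which both quantities grow only logarithmically.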



Our work extends and complements earlier work on the concurrent-mutations regime.
Kessler et al. (1997) and Ridgway et al. (1998) studied a model like ours, although their
initial work did not properly account for all stochastic effects. Recently they have developed
a moment-based approach which provides results qualitatively similar to ours in certain
regimes (Kessler and Levine, 2003). This is a potentially useful technique, although, as
discussed in Appendix A, it quickly becomes unwieldy as more moments need to be kept,
and numerical analysis is required.



Rouzine et al. (2003) also studied a model similar to ours in the context of HIV evolution.
Their analysis also involves a separation between deterministic and stochastic behavior,
but treats the stochasticity at the nose in a different and less explicit manner. To couple this
to the deterministic results, Rouzine et al. (2003) appear to require a smoothness in the
fitness distribution which would obtain only when it is broad. Thus their analysis is strictly
valid only at what we would call very high speeds: $v \gg s^2$. But, because they treat only
one population stochastically at a time, their analysis also requires $U_b/s \ll 1$, so their results
are valid only at enormous population sizes (and very large $q > \ln(s/U_b)$). This regime is
likely to be relevant for certain viral populations, which was their main focus. Nevertheless,
the results of Rouzine et al. (2003) are similar to ours, in that they involve logarithms of
Ns and $s/U_b$ in similar ways (though they do differ substantially: in the regimes we have
considered their results lead to errors typically ranging from plus or minus 50% to 250%).
This is unsurprising, since the simple beneficial mutation-selection balance arguments (in
our heuristic analysis section) apply to the very fast regime of Rouzine et al. (2003) as
well, and lead generally to logarithms of Ns and $s/U_b$. Further analysis shows, if some
algebraic errors are corrected in their work, and the large Ns and $s/U_b$ asymptotics are worked
out, that our result for v can be recovered up to somewhat different factors inside large
logarithms (Rouzine, 2006).



Various studies have been carried out on clonal interference, the other effect that
occurs when there are concurrent mutations (i.e. in the strong selection strong mutation
regime) (Campos and De Oliveira, 2004; Gerrish, 2001; Gerrish and Lenski, 1998;
Johnson and Barton, 2002; Kim and Stephan, 2003; Orr, 2000; Wilke, 2004).



have discussed the relationship between this work and ours and analyzed a model with a
distribution of beneficial mutations which includes both clonal interference and multiple
mutation effects. Kim and Orr (2005) have also analyzed some of the interplay between
these effects. Clonal interference analysis by itself makes qualitatively similar predictions
about the rate of accumulation of beneficial mutations as the full theory. Both predict that
v grows much less than linearly in N and $U_b$ (as do the analyses of Kessler and Levine
(2003) and Rouzine et al. (2003)), though the quantitative predictions differ. The major
qualitative differences are in the mechanisms by which the evolution takes place. In clonal
interference analysis large mutations that occur in individuals which have roughly the mean
fitness (i.e. in the majority subpopulation) dominate the evolution. Thus one would
expect to see strong selective sweeps and a population that is typically either nearly clonal
or in the midst of such a sweep (except occasionally when a smaller mutation becomes
transiently very common before being outcompeted by a larger one). By contrast, except
when there is a long tail to the distribution of s, we have shown that the evolution is
dominated by multiple mutations of intermediate effect, so the selective sweeps are much
less pronounced and the population always maintains substantial variation in fitness. And
we have shown that even when the distribution of s does have a long tail, some of the
quantitative predictions for the speed of the evolution are different from clonal interference
predictions. We find that the mutations that dominate the evolution in the concurrent
mutations regime have strengths in a narrow range around some predominant value, $\tilde{s}$. The
simple single-s model is thus surprisingly good, provided we use $s = \tilde{s}$ and $U_b$ equal to the
mutation rate to beneficial mutations of this magnitude.



Over the past few years, much experimental evidence has accumulated which supports the
prediction that v grows less than linearly in N and $U_b$ (Colegrave, 2002; de Visser et al.,
1999; de Visser and Rozen, 2005; Miralles et al., 1999, 2000). This has often been
interpreted as support for the clonal interference picture. However, the experimental data
on the quantitative details of the dependence of v on N and $U_b$ cannot distinguish between
clonal interference analysis and our results. Thus these experiments also support our general
theory.

We have recently (in collaboration with Andrew Murray) conducted experiments on asexual
evolution of yeast in low glucose. For a range of different N and $U_b$ we measured the
distributions of fitnesses within the evolving populations and the dynamics (v(t)) by which
the fitness increased (Desai et al., 2006). Since, unlike earlier work, these experiments
measured the widths of the fitness distributions and the strengths of selective sweeps, we
were able to distinguish between our analysis and clonal interference acting alone. The
experimental data support the multiple-mutation theory, with both v and the leads of the
fitness distributions depending on N and $U_b$ consistent with our predictions. Clonal
interference analysis, on the other hand, would predict that populations maintain less variation
in fitness, and that this variation would not scale with N and $U_b$ as we predict. We also
measured how the populations increased in fitness over time, finding smooth increases
suggestive of multiple mutations of intermediate size fixing together. This was again consistent
with our theory and inconsistent with clonal interference alone, which would suggest rare
larger mutations dominate the evolution. Combining all these data, we found that clonal
interference was ruled out unless several parameters were finely tuned. Thus in the only
experimental test able to distinguish the two effects, multiple mutation effects explain the
data better than clonal interference alone. This represents only one set of experiments in
one organism in one selective condition, so it is quite possible that in other circumstances
the reverse will be true. Yet, even if clonal interference is found to better characterize the
dynamics in some situations, we have shown that to understand this properly one needs to
analyze the interplay between this and multiple beneficial mutations on the same genome.

Despite being consistent with one experimental test, the model we have analyzed surely
has many shortcomings. We have analyzed one of the simplest possible situations for positive
selection. Violations of certain simplifying assumptions, such as neglecting deleterious
mutations and assuming a single effect s of beneficial mutations, may well, as we have
argued, have relatively minor effects beyond modifying the effective parameters $U_b$ and s of the
model. Furthermore, the neglect of interactions between effects of mutations (epistasis) may
not invalidate the overall results. The key assumption is that the distribution of the
magnitudes of available beneficial mutations is roughly independent of the genetic background
even though the actual set of these mutations varies. That is, after each uphill-fitness step
is taken, the distribution of possible next steps is similar, although they may now be in
different "directions".

However, breakdown of some of our assumptions will surely be crucial. For example,
certain non-multiplicative (epistatic) effects of beneficial mutations, as well as
frequency-dependent selection, can lead to very different behavior. But our results should serve as
a null model, useful in forming baseline predictions. Departures from the main results,
especially the scalings with population size and mutation rates, indicate the presence of
one or more complicating factors. Even within the context of our simple model, however,
many important questions remain.

One of these is the expected genetic variation. We have calculated the expected variation 
in fitness, but individuals with the same fitness will often have different sets of beneficial 
mutations. Thus the true genetic diversity at the positively selected sites will be substantially 
greater than the variation in fitness. Although sometimes the first new mutant to establish 
will dominate the lead population, typically around q different beneficial mutations will 
occur and contribute to extending the nose during one establishment. Subsequent mutations 
that further extend the nose will occur at random among these different backgrounds, thus 
changing (and typically reducing) this diversity, even as the diversity of the new mutations 
is created. Eventually particular beneficial mutations do sweep, but these sweeps are not 
necessarily uniform. Instead, frequencies typically go up and down depending on which 
backgrounds future mutations occur in. Understanding this diversity is important if one is 
to look for the signature of this type of selection in sequence data. It is also important to 
understand the potential benefits of sex, as we discuss below. 

In addition to the diversity at the positively selected sites, we would also like to understand
the expected patterns of variation at neutral and deleterious sites. This will have a
very different character than in neutral evolution or in the successional-mutations picture
of positive selection, and may also help detect concurrent-mutations evolution in sequence
data. The neutral, deleterious, and beneficial diversity is also important in understanding
the role of epistasis. If potential beneficial mutations have epistatic interactions with other
mutations, the typical variation in the presence of other mutations is crucial.

Another important question is the effect of sex or recombination in a population in 
the concurrent-mutations regime. According to the Fisher-Muller hypothesis, sex should 
reduce interference effects and hence prevent the wasting of beneficial mutations. This 
allows sexual populations to accumulate beneficial mutations faster than asexual ones. 
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Crow and KimuraI (119651 ), IbodmerI (119 701 ). and Imaynard Smith (119711 ) attempted to 



calculate the strength of this effect by comparing the v in an asexual population to the v in a 
population with free recombination. They defined the advantage of sex to be the difference 
between these quantities. However, their calculation of the asexual v assumed that only two 
beneficial mutations were possible — thus ignoring triple and higher mutants, and not prop- 
erly accounting for the competing effects of mutations and selection. With our calculation of 
the asexual v , however, one can make this comparison. In the completely free recombination 
case, all beneficial mutations behave independently: there is no interference between them, 
nor collective behavior among them. Thus with free recombination, $v_{fr} = N U_b s^2$, as in
the successional-mutations regime. This gives a huge Fisher-Muller advantage to sex, the
difference between $v$ and $v_{fr}$, which is zero in small populations and grows rapidly as $N$ or
$U_b$ increases.
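
As a rough numerical illustration of this comparison, the following Python sketch evaluates the free-recombination rate $N U_b s^2$ against an asexual rate. The asexual expression used here, $v \approx s^2[2\ln(Ns) - \ln(s/U_b)]/\ln^2(s/U_b)$, is assumed to be the large-population result of the main text, and the switch between regimes at $N U_b \ln(Ns) = 1$ is a crude simplification introduced only for this example. All function names are hypothetical.

```python
import math

def v_free_recombination(N, Ub, s):
    """Rate with free recombination: every beneficial mutation acts independently."""
    return N * Ub * s**2

def v_asexual(N, Ub, s):
    """Asexual rate (sketch).

    Assumption: below N*Ub*ln(Ns) = 1 mutations fix one at a time, so the
    successional-mutations rate N*Ub*s^2 applies; above it we use the
    large-population form assumed in the lead-in.
    """
    if N * Ub * math.log(N * s) <= 1.0:
        return N * Ub * s**2
    ell = math.log(s / Ub)
    return s**2 * (2.0 * math.log(N * s) - ell) / ell**2

def fisher_muller_advantage(N, Ub, s):
    """Difference between the free-recombination and asexual rates."""
    return v_free_recombination(N, Ub, s) - v_asexual(N, Ub, s)

if __name__ == "__main__":
    for N in (1e3, 1e6, 1e9):
        print(N, fisher_muller_advantage(N, 1e-5, 1e-2))
```

With these illustrative parameters the advantage vanishes for the small population and approaches the full free-recombination rate for the largest one, because the asexual rate grows only logarithmically with $N$.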

However, the above analysis is not directly applicable to the evolution of sex, since sex 
and completely free recombination are certainly not synonymous. Rather, sex may only 
occur occasionally or recombination might be infrequent, so that linkage persists for some 
time. An interesting situation is when sex and recombination are relatively rare. We would 
like to understand whether or not a small amount of sex in an otherwise asexually evolving 
population would be advantageous (and hence be likely to become more common). To do so 
within the simplest model for the asexual evolution, we must first calculate the true genetic 
diversity among beneficial mutations within all the subpopulations at different fitnesses. 
Given this, we can then calculate the probability that sex between any two individuals will 
produce more-fit offspring. 

This is a subtle question, because the average effect of sex on the variance in fitness or 
in the tendency to bring together good mutations more than it breaks them up is largely 
irrelevant. Rather, what is important is the rate at which recombination generates (or 
eliminates) anomalously fit individuals — that is, its effect on the nose. Sex will tend to 
break up beneficial mutations at the nose, and hence tend to destroy some of the most-fit 
individuals. At the same time, however, it will occasionally mix two less-fit individuals in 
just the right way to create an offspring which is more fit than the current nose. It is the 
competition between these two effects that determines the advantage of sex. Even if sex on 
average tends to increase the variance in fitness, this will not increase the speed of evolution
in the long term if it does not also extend the nose. Rather, the increased variance from sex
will be balanced by the actions of selection and mutation (in the end, the mean fitness cannot 
advance any faster than the nose), and the rate of adaptation will be largely unchanged. 
On the other hand, if sex does extend the nose it will tend to speed up the evolution even 
if it has little effect on the variance. In this case, these occasional sex-driven expansions of 
the nose would act like extra mutations, which modify the mutation-selection balance and 
cause an increase in the steady state variance via increasing the lead — even though sex has 
no direct effect on the variance. 

In recent years, Otto and Barton have made substantial progress in understanding the
effects of sex, short of completely free recombination, in the Fisher-Muller picture (Barton
1995; Barton and Otto 2005; Otto and Barton 1997, 2001). This work takes the
Hill-Robertson perspective and does not include the full dynamics of the asexual population 
that we have worked out here. As far as we are aware, it is not clear whether the effect 
of sex at the nose within our calculated population structure is the same as the effect of 
sex in Otto and Barton's analysis. Future work is needed to unify these perspectives and 
understand the effects of sex even within the simplest models. 

In this paper, we have explored evolutionary dynamics when beneficial mutations are com- 
mon and there are many present concurrently. We have laid out an analytical and conceptual 
framework for understanding how asexual populations accumulate beneficial mutations:
the dynamics of adaptation in this extremely basic situation. Using this framework, we
have demonstrated that the rate at which a population accumulates beneficial mutations
increases only slowly with population size or mutation rate beyond a certain point.
Although we have focussed on the effects of multiple mutations, we have also analyzed the 
interplay between this and clonal interference between mutations of different strengths. The 
results have implications for comparing evolution between different populations, and for de- 
signing experiments to investigate various aspects of evolution in the laboratory. Statistical 
tests that can distinguish, based on sequence data, between various scenarios for ongoing 
evolution are needed: our results provide a step in this direction. More generally, our results 
provide a framework for starting to address the effects of sex, of mutators, and of epistatic 
interactions in large populations.

Appendix A: Deterministic and Moment-Based Approaches

There are a variety of other possible approaches to studying the problem we have ana- 
lyzed. In this Appendix, we briefly discuss two of these: deterministic approximations and 
moment-based approaches. Both of these methods start by considering the distribution of 
fitnesses within the population as some function w(x,t), which describes the number of in- 
dividuals at fitness x at time t. As long as s is small, w can be treated as continuous: this is 
equivalent to the conventional "diffusion approximation". The forces of mutation, selection, 
and random drift then lead to a stochastic differential equation which describes the time 
evolution of this distribution w(x,t), 

$$\frac{\partial w(x,t)}{\partial t} = \left[x - \bar{x}(t) - U_b\right] w(x,t) + U_b\, w(x-s,t) + \sqrt{w(x,t)}\,\zeta(x,t), \qquad (74)$$

where $\bar{x}(t)$ is the population mean fitness and $\zeta$ is a Gaussian random term but with subtle
correlations needed to ensure that the fluctuations do not change the total population size
$N = \sum_x w(x)$. Studying this equation can then lead to predictions of the speed of evolution,
maintenance of variation, and other interesting quantities.

The simplest possible approach is to neglect genetic drift and attempt an "infinite-N"
solution to the problem. This deterministic approach is extremely useful in many situa- 
tions, including in understanding deleterious mutation-selection balance. However, when 
considering beneficial mutations, it is essential to account for genetic drift and, crucially, 
the discrete nature of individuals. Fractional numbers of deleterious mutations, implicit in 
the deterministic mathematical analyses that are often appropriate for large populations, are 
of little consequence because they are selected against. But allowing fractional numbers of 
beneficial mutants at the nose yields nonsense because fractional individuals that are highly 
fit multiply and take over the population. Thus even for very large populations, the
population size, which determines the smallest fraction of the total population that represents
at least one individual, plays a crucial role. Infinite-N deterministic approximations are not
even qualitatively correct.

The problems with the simple deterministic approximation to Eq. (74) are revealed
by analyzing the resulting behavior. This shows that the deterministic solution does not 
support a steady state v — rather, it predicts that the speed of evolution accelerates without 
bound. This is clearly unbiological, as it involves a concomitant exponentially increasing 
width of the distribution and thus smaller and smaller numbers in the nose. Except for 
very short times (roughly until the nose develops in the correct analysis), the deterministic 
approximation is thus drastically wrong even for very large N. The source of the problem is 
that each more-fit population grows faster than the one before. Thus early mutants into a 
new more-fit fitness class at fitness x + s grow faster than the population at fitness x. This 
means that even tiny fractions of an individual — certainly non-biological! — will later give 
rise to a large population even without further mutations. Indeed, it is the "descendants" 
of these early fractional mutants that will later dominate the population of individuals at 
fitness x + s, despite the fact that there are more mutants occurring from fitness x. These 
descendants then produce fractional mutants to fitness x + 2s, and the unrealistic aspects 
are further exacerbated. 
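
The runaway can be checked numerically. The sketch below is a minimal forward-Euler integration of the drift-free part of Eq. (74) on discrete fitness classes $x = ks$. The discretization, step size, truncation at a finite number of classes, and function names are all assumptions of this example. For a population starting in a single class, these linear equations have a Poisson solution with mean fitness $U_b(e^{st}-1)$, which the code uses as a reference.

```python
import math

def deterministic_mean_fitness(s, Ub, t_end, dt=0.005, n_classes=30):
    """Euler-integrate class frequencies f_k (fitness k*s) with no drift term."""
    f = [0.0] * n_classes
    f[0] = 1.0  # everyone starts in the k = 0 class
    for _ in range(int(round(t_end / dt))):
        mean = sum(k * s * fk for k, fk in enumerate(f))
        new = []
        for k, fk in enumerate(f):
            inflow = Ub * f[k - 1] if k > 0 else 0.0
            # selection relative to the mean, mutation out, mutation in
            new.append(fk + dt * ((k * s - mean - Ub) * fk + inflow))
        total = sum(new)
        f = [x / total for x in new]  # keep the frequencies normalized
    return sum(k * s * fk for k, fk in enumerate(f))

def closed_form_mean_fitness(s, Ub, t):
    """Mean fitness of the Poisson solution started from a single class."""
    return Ub * (math.exp(s * t) - 1.0)

if __name__ == "__main__":
    s, Ub = 0.1, 1e-3
    for t in (20, 40):
        print(t, deterministic_mean_fitness(s, Ub, t),
              closed_form_mean_fitness(s, Ub, t))
```

The mean fitness gained in the second time interval far exceeds that gained in the first. The speed grows exponentially instead of settling to a steady state, which is the unbiological behavior described above.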

An alternative way to study Eq. (74) is to use a moment-based approach. We can
multiply Eq. (74) by $x$ and integrate to find the rate of change of the first moment of the
fitness distribution (the speed of evolution) in terms of the second moment (the variance).
In the limit that mutation is negligible compared to selection in the bulk of the fitness
distribution, $d\langle x\rangle/dt \approx \mathrm{var}(x)$, which is simply the fundamental theorem of natural selection. One
can easily work out that the time derivative of the second moment (the variance) involves 
the third moment. The time derivative of the third moment involves the fourth moment, 
and so on. This moment hierarchy does not close. Even so, this approach can yield accurate 
results for short timescales. The more moments that are kept, the longer the results will be 
accurate for, and if enough are kept the steady state speed of evolution can be calculated 
accurately. The lowest-order version of this is familiar: it corresponds to assuming that
the variance is given by its value at $t = 0$ and does not change, and that the speed of
evolution is equal to that.

Kessler and Levine (2003) have carried out a sophisticated analysis using a moment-based
approach; their work contains a more detailed analysis of the issues involved. Accounting
properly for the effects of mutations, stochasticity, discreteness in population number,
and fixed total population size is very difficult. Thus far, this analysis involves complex
moment equations which unfortunately provide little intuition and no simple analytic results.

The problems with moment equations are unsurprising based on our analysis. As we have 
noted, it is the lead qs, not the variance or another moment which is most naturally thought 
of as being maintained by the balance between mutation and selection. This lead is not a 
moment of the fitness distribution — it is instead a measure of its nose, near to which the 
discreteness in population number is crucial. The lead thus represents some combination 
of high moments of the fitness distribution, with the order of the moments that matter 
depending on $N$: to capture the effects of the sharp nose of the distribution, at least of
order $2 \ln Ns$ moments are needed, and such high order moments may be dominated by
rare fluctuations of the lead. It is hardly surprising that getting at the dynamics of the 
lead with a moment expansion is very cumbersome. Our approach, in contrast, handles the 
stochastic issues at the nose in a natural way while simply tracking the effects of selection 
that dominate in the bulk of the distribution. 

Appendix B: Variable N and Effective Population Size 

We have thus far assumed that the population size is constant. We now consider what 
happens when we relax this assumption. 

If changes in $N$ are rapid compared to the changes in the mean fitness, then we can define
a constant effective population size $N_e$. The definition of $N_e$ can be complicated: it is not
necessarily the geometric mean of the actual population sizes. Rather, $N_e$ is the value of
the constant population size in our model that gives the same stochastic dynamics as the
changing-$N$ situation averaged over a timescale long compared to the shifts in $N$. In
practice, this means that if our variable-$N$ population were clonal, $N_e U_b s$ must be the
time-averaged rate at which beneficial mutations would establish. Our theory at constant $N$ is
then correct provided we use $N = N_e$. Serial dilution protocols are one case relevant to many
experimental situations. Here, a population grows exponentially for $G$ generations, is diluted
back to its original size $N_b$, and then this cycle is repeated. The effective population size in
this scenario was calculated by Wahl and Gerrish (2001), who found $N_e = N_b G \ln 2$.

In the opposite regime where the changes in N are much slower than changes in the mean 
fitness, the lead and fitness distribution adjust quickly enough that the correct steady-state 
behavior for the current N always obtains. This means that we can simply replace the N 
in our results with the time-dependent N(t). 
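
For the serial dilution case quoted above, the effective size is simple to evaluate. The Python sketch below applies $N_e = N_b G \ln 2$. The helper that converts a dilution factor into generations per cycle assumes each generation doubles the population, and the function names and example numbers are illustrative only.

```python
import math

def generations_per_cycle(dilution_factor):
    """Doublings needed to regrow after a D-fold dilution (assumes binary fission)."""
    return math.log2(dilution_factor)

def effective_size_serial_dilution(N_b, G):
    """Effective population size N_e = N_b * G * ln(2) for serial dilution."""
    return N_b * G * math.log(2)

if __name__ == "__main__":
    N_b = 1e6                       # bottleneck size after each dilution
    G = generations_per_cycle(100)  # 100-fold dilution, about 6.64 generations
    print(effective_size_serial_dilution(N_b, G))
```

Under the doubling assumption, $G \ln 2 = \ln D$ for a $D$-fold dilution, so $N_e$ exceeds the bottleneck size only by the logarithm of the dilution factor.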

If the changes in N occur on comparable timescales to the changes in the mean fitness, the 
situation is much more complicated. We cannot define an effective population size, because 
the changes in $N$ are too slow to be "averaged" over. On the other hand, the changes in $N$
are too fast to allow the population to continuously adjust and stay in steady state. Rather, 
the population will often be in a transient regime with a complex dependence on past values 
of N. We do not analyze this case. Though it is an interesting subject for future work, it is 
a special situation which is unlikely to have general importance. 



Appendix C: Running out of Beneficial Mutations 

We have taken the beneficial mutation rate $U_b$ to be a constant. However, each beneficial
mutation that establishes is likely to change the total number of beneficial mutations that 
are available. Clearly once an individual has a beneficial mutation, that particular mutation 
is no longer available. But it is also possible that one mutation may open up or close off 
other possibilities. Thus the beneficial mutation rate Ub may change in complicated ways. 

In many cases, $U_b$ will change slowly with each mutation. Our theory predicts that the
steady-state value of $q$ at a given $U_b$ is $q(U_b) = 2\ln(Ns)/\ln(s/U_b)$. Provided that the change in $q(U_b)$
over $q$ mutations (after which the fitness distribution has moved through its full width)
is small, then the population is always approximately in the steady state and our theory
still holds: we simply replace $U_b$ everywhere with the appropriately varying $U_b(t)$. This
condition holds provided the change in $U_b$ from a single mutation is small enough that $q(U_b)$
changes by much less than one.

When $U_b$ changes rapidly enough with each mutation that this condition is violated, the
population fitness distribution does not adjust quickly enough to stay in steady state. In
this case, the population will often be in a transient regime with a complex dependence on
past values of $U_b$. This situation can be analyzed with the algorithmic methods described
in the section on transient behavior.

One type of change in $U_b$ is of particular interest: when each mutation that establishes
is no longer available, but does not open up or close off any other possibilities. We assume
that there are initially $k$ beneficial mutations, each of which occurs at a rate $\mu$. After $i$ such
mutations have been established, there are $\ell = k - i$ left, and the mutation rate is $U_b = \ell\mu$.
This situation has been analyzed in great detail by Rouzine et al. (2003). We can get a
sense of the behavior by substituting $U_b = \ell\mu$ into our formula for $q$ to calculate how much
$q$ changes after a single establishment. If this is much less than 1, our steady-state theory
is a good description of the dynamics; we simply use the appropriate (changing) value of
$U_b$. Otherwise, the population will often be in a more complicated transient regime. This
condition corresponds to
where $q_\ell$ is the value of $q$ corresponding to $U_b = \ell\mu$. Since we have assumed that $s/U_b \gg 1$,
this condition will almost always be satisfied, even for very small values of $\ell$ (i.e. when the
population has almost reached the fitness "peak"). The only potential complication is that
if - < 1, then our assumption -jj- 3> 1 may break down for small values of i. 
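
The size of the change in $q$ per establishment is easy to check numerically. The following Python sketch assumes the steady-state relation $q(U_b) = 2\ln(Ns)/\ln(s/U_b)$ with $U_b = \ell\mu$, and evaluates how much $q$ shifts when the number of remaining mutations drops from $\ell$ to $\ell - 1$. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

def q_steady(N, s, Ub):
    """Steady-state q, assuming q = 2 ln(Ns) / ln(s/Ub)."""
    return 2.0 * math.log(N * s) / math.log(s / Ub)

def q_change_per_establishment(N, s, mu, ell):
    """Change in q when the remaining mutations drop from ell to ell - 1 (ell >= 2)."""
    return q_steady(N, s, ell * mu) - q_steady(N, s, (ell - 1) * mu)

if __name__ == "__main__":
    N, s, mu = 1e9, 1e-2, 1e-7
    for ell in (100, 10, 2):
        print(ell, q_steady(N, s, ell * mu),
              q_change_per_establishment(N, s, mu, ell))
```

For these parameters the shift in $q$ stays well below one even at $\ell = 2$, in line with the statement that the steady-state description holds almost all the way to the fitness peak.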



Appendix D: Fluctuations in $\tau$, Variations in $v$, and Stability of the Steady State

The establishment time $\tau_q$ is a random variable. Above we calculated the steady state
assuming that each establishment takes the average establishment time $\langle\tau_q\rangle$. However, there
are stochastic variations in this establishment time which lead to fluctuations in the speed
of evolution. These variations could also affect the average $v$, because the average $v$ is really
determined by the average effect of variable $\tau_q$, not the effect of the average $\tau_q$ as we have
assumed thus far.

The full distribution $P(\tau_q)$ is a special function: a change of variable in the one-sided
Lévy distribution $P(n_q)$. However, we can calculate arbitrary moments $\langle\tau_q^m\rangle$. The second
moment is

$$\langle\tau_q^2\rangle = \frac{1}{(q-1)^2 s^2}\left\{\ln^2\left[\frac{s\,(q-1)\sin(\pi/q)}{U_b\,\pi\,e^{\gamma/q}}\right] + \frac{\pi^2(2q-1)}{6q^2}\right\}. \qquad (76)$$

From this we can calculate the variance in $\tau_q$,

$$\mathrm{Var}(\tau_q) = \frac{\pi^2}{6s^2}\left[\frac{1}{(q-1)^2} - \frac{1}{q^2}\right]. \qquad (77)$$

The relative variation in $\tau_q$ is thus




For small ^, this is small even for q = 2 and decreases as - for large q. Thus the total fluc- 



tuations in the lead (and the speed of evolution) are small, and ignoring them in calculating 
the average v is reasonable. 

From these fluctuations in $\tau_q$, we would like to calculate the expected fluctuations in $v$.
This would explain how much variation in adaptation we should expect between different
populations experiencing the same conditions (for example, geographically distinct
subpopulations or different experimental lines). Unfortunately, however, this is a difficult problem.
This is because successive establishment times are not independent. A shorter than average
$\tau_q$ immediately increases the lead. This tends to make subsequent establishments shorter as
well. The opposite is true for longer than average $\tau_q$. Thus the lead is unstable to
fluctuations in the short term: increasing the lead due to a short $\tau_q$ creates a tendency to further
increase the lead, and vice versa. This effect is enhanced because a shorter than average $\tau_q$
means that the population is less influenced by subsequent mutations, so its size earlier is
slightly bigger than usual (i.e. $\tau(2\tau_q)$ is closer to $\tau_q$ than usual). Again, the opposite is true
for longer than average $\tau_q$. This short-term instability is checked at later times. A
subpopulation with a short $\tau_q$ is more fit relative to the mean than it would be with an average
$\tau_q$. It thus becomes the dominant subpopulation, increasing the mean fitness, more quickly.
When this happens, the lead is decreased, roughly $q$ establishments after the short $\tau_q$.
Thus the various $\tau_q$ are correlated in a complicated way: a short $\tau_q$ tends to favor further
short $\tau_q$, until roughly $q$ establishments later when it favors longer $\tau_q$, and the opposite is
true for longer than average $\tau_q$.

To understand these complications, it is important to consider more carefully the form
of the distribution of $\tau_q$, especially for large $q$. Since $\ell = \ln(s/U_b)$ is large, it is convenient to
define




(79)
with $\Delta$ having both average value and stochastic fluctuations of order unity (and thus small
compared to $\ell$). For small $q$, the characteristic magnitude of the fluctuations is correctly
captured by the variance. The behavior for large $q$ is somewhat more subtle. In this limit,
the mean value of $\Delta$ is of order $1/q$, but its distribution has an interesting form: $\Delta$ is
typically much smaller than $1/q$ and is rarely negative, but with probability of order $1/q$ it
is positive of order unity. The variance of $\Delta$ is thus of order $1/q$ as can be seen from the above
result for $\mathrm{Var}(\tau_q)$ but, in contrast to what one might expect, all higher moments are also
of order $1/q$. The strongly asymmetric form of the distribution of $\tau_q$ has a simple origin:
there is some chance that an establishment occurs anomalously early, but as the feeding
population is producing mutants at an exponentially growing rate, it is highly unlikely that
the establishment will be anomalously late.

For large $q$ the form of the distribution of $\tau_q$ has implications for the distribution of the
"sweep" time, $t_s$, until new mutants will dominate the population. This will be
$t_s \approx q\tau_q \approx q\ell/[(q-1)s] \approx \ell/s$ on average for large $q$. The variations in $t_s$ will arise from two sources.
The first is the sum of the variations of $q$ successive $\tau_q$'s. From the above discussion, the
sum of $q$ $\Delta$'s will have a distribution with typical and average value both of order unity.
This will give rise to fractional variations of $t_s$ of order $1/qs$, which is smaller than the mean
$t_s$ by a factor of $1/q\ell$.

But there is another factor that needs to be taken into account: a short $\tau_q$ will increase
the lead and thus make the next establishment likely to happen somewhat sooner, thereby
making subsequent ones likely to be even earlier. Until the mean population feels the effects
of the series of new mutant subpopulations, the lead is thus exponentially unstable. But
this effect is not large: the deviation from average, $u(t)$, of the speed of the lead, grows
proportionally to the increase, $X(t)$, of the lead from $qs$ with

$$u \approx \frac{s}{\ell}\,X. \qquad (80)$$

Thus $dX/dt = Xs/\ell$, so that for large $q$ in a time $t_s \approx \ell/s$, an anomalously large lead will only
grow further by a factor of $e$. This means that the effects of the exponential instability of
the lead are only beginning to be felt before they are counteracted by a sooner than typical
advance of the mean fitness. The above estimate from a sum of roughly independent $\Delta$'s
thus correctly gives the rough magnitude of the small variations in $t_s$. But the correlations
between successive $\tau_q$'s mean that the velocity fluctuations are correlated over times of
order $t_s$.

On time scales much larger than $t_s$, the mean fitness $y(t)$ will grow, with the mean speed
$v$ and diffusive fluctuations around this described by

$$\langle[y(t)-y(t')]^2\rangle \approx [v(t-t')]^2 + 2D|t-t'|, \qquad (81)$$

with the diffusion coefficient inferred from the above to be



Appendix E: On the Cutoff in the Integral in $H$ and the Pathologies of $\langle n_q(t)\rangle$

One initially surprising property of the distribution $P(n_q,t)$ is that it has infinite mean:
that is, $\langle n_q\rangle = \infty$. The infinity arises because we have allowed mutations from $n_{q-1}$ to
$n_q$ to occur arbitrarily far back in the past, even before the establishment of the $q-1$
population (as described in the main text, this was implicit in using $-\infty$ as the lower limit
of integration in the expression for $H$). Naively, it seems that this is a serious problem, and
that the solution is to impose a realistic cutoff in time before which mutations are disallowed.
That is, we could say that before $t = t_i$ there is a negligible chance of mutations occurring
and therefore set the lower limit of integration in $H$ to be $t_i$. This does remove the infinite
$\langle n_q\rangle$. However, it does nothing to address the underlying issue. Rather than being infinite,
we would then have $\langle n_q\rangle$ depending very strongly on $t_i$. This is biologically unreasonable,
since the population $n_q$ arises from mutations which tend to occur only after $n_{q-1}$ reaches
a relatively large size (naively, of order $1/U_b$). Certainly the important properties of $n_q(t)$
should be independent of whether we consider only mutations that occur after $n_{q-1}$ reaches
one individual versus two individuals, for example. Indeed, since our expression for $n_{q-1}(t)$
is not valid at these small subpopulation sizes anyway, for our results to be valid, they had
better not depend on such early times.

The solution to this apparent dilemma lies in the fact that the average $n_q(t)$ is not an
important property of the distribution of $n_q(t)$. Rather, $\langle n_q\rangle$ is dominated by events so rare
that they will never actually occur in practice: namely, when a mutation occurs in the
subpopulation $n_{q-1}$ while $n_{q-1}$ is extremely small. The reason for the resulting large $\langle n_q\rangle$
is that even though mutations are very rare far back in time when $n_{q-1}$ is small, they have
a huge effect on the future $n_q$ when they do occur and establish. Since the subpopulation
$n_q$ grows faster than the subpopulation $n_{q-1}$, the very early mutations dominate over later
ones. This can be seen explicitly. The probability of a mutation from the population $n_{q-1}$
at a time $t_0$ is proportional to $U_b e^{(q-1)st_0}$, and if a mutation occurs at that time it will on average lead to
a lineage that at later time $t$ is of size $n_q \sim e^{qs(t-t_0)}$. Thus the contribution to $\langle n_q(t)\rangle$ from
mutations at time $t_0$ is of order $U_b e^{-st_0} e^{qst}$. The dependence on $t$ is as expected. However,
the dependence on $t_0$ is such that the smaller $t_0$ is (especially at large negative $t_0$), the larger
the contribution to $\langle n_q\rangle$. This average $n_q$ is thus dominated by mutations that happened
very early. The essential point is that although the probability of a mutation decreases
exponentially at rate $(q-1)s$ as we decrease the initial time $t_0$, its effect on $n_q$ increases
exponentially at the faster rate $qs$.

But the lower limit of the mutation times is only important for determining the very
large-$n_q$ form of $P(n_q,t)$. This part of $P(n_q,t)$ contains extremely small probabilities of
extremely large $n_q$, in such a way that all integer moments of $n_q$ depend crucially on this
choice. However, this high-$n_q$ part of $P(n_q,t)$ represents such a small total probability that
it would not occur in any real population. Thus getting $P(n_q,t)$ correct for this high $n_q$
cannot matter. To get the quantities of interest, in particular $\langle\ln n_q(t)\rangle$, we can therefore
use any cutoff we choose, and $-\infty$ is a convenient choice.

The problems with $\langle n_q(t)\rangle$ all stem from the fact that the population grows exponentially
once mutations occur. Thus it is natural to "factor out" this deterministic exponential
growth in defining aspects of the distribution $P(n_q,t)$ and then focus on the distribution of
$\ln n_q(t) - qst$. This is what our definition of $\tau_q$ accomplishes. The variable $\tau_q$, as we have
seen, has none of the problems of $\langle n_q(t)\rangle$ and its distribution is independent of the cutoff we
choose (except for a tiny and irrelevant tail for very anomalously small $\tau_q$). As described in
the section on the fate of a single mutant, the essential point here is the difference between
$\langle e^X\rangle$ and $e^{\langle X\rangle}$. The former (analogous to $\langle n_q\rangle$) is very sensitive to the tails of $P(X)$, while
the latter (analogous to $e^{\langle\ln n_q\rangle}$) is not. And it is the latter that will determine the mean speed
and fluctuations around this.
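
The difference between averaging before and after exponentiating can be made concrete with a toy distribution. In the Python sketch below, a variable $X$ is zero almost always but takes a large value with tiny probability. The distribution, its parameters, and the function names are invented purely for illustration and are not taken from the model.

```python
import math

def mean_of_exp(values, probs):
    """<e^X>: average of the exponential, dominated by rare large X."""
    return sum(p * math.exp(x) for x, p in zip(values, probs))

def exp_of_mean(values, probs):
    """e^<X>: exponential of the average, insensitive to rare large X."""
    return math.exp(sum(p * x for x, p in zip(values, probs)))

# X = 0 almost always, but X = 30 with probability 1e-6 (a rare early event).
values, probs = [0.0, 30.0], [1.0 - 1e-6, 1e-6]
with_tail = (mean_of_exp(values, probs), exp_of_mean(values, probs))

# The same distribution with the rare event removed.
no_tail = (mean_of_exp([0.0], [1.0]), exp_of_mean([0.0], [1.0]))

if __name__ == "__main__":
    print("with tail:   ", with_tail)
    print("without tail:", no_tail)
```

Removing an event of probability $10^{-6}$ changes the first quantity by about seven orders of magnitude and the second by a few parts in $10^5$, which is why a cutoff that only affects such events is harmless for the second kind of average.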
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Appendix F: Multiple Stochastic Clones and 

Our analysis rests on a separation between deterministic and stochastic dynamics, which 
we used to overcome the limitations of branching process models. Such a separation is 
always possible for $Ns \gg 1$, as noted above, because nonlinear effects are not important
when stochastic effects are, and vice versa. However, we have made a stronger assumption: 
that the separation is possible right at the nose, so that only the most-fit subpopulation 
must be treated stochastically but that all other subpopulations are deterministic. This 
is an important assumption, as a full stochastic treatment would involve, for example, a 
double-mutant subpopulation whose size is a random variable sending mutations into a 
triple-mutant subpopulation whose size is also a random variable, and so on. These multiply 
random processes are difficult to understand analytically. 

Fortunately, there is a broad parameter regime in which only the most-fit subpopulation 
is small enough to require stochastic analysis. Two conditions must be met. First, the 
most-fit subpopulation at the nose cannot generate new mutations that are destined to fix 
until it has become large enough that the stochastic effects are negligible. Implicit in this 
condition is the assumption that the most-fit subpopulation can generate mutations that 
are destined to go extinct due to drift. This naively seems reasonable, as mutations destined 
to go extinct due to drift should not matter in the long term. This leads to the second 
condition: a population destined to go extinct due to drift cannot itself generate a mutation 
that will become established — otherwise it does matter after all. Here we consider this 
latter condition. In the next Appendix, we consider the former. 

We begin by studying the dynamics of the lineage founded by a single mutant. Thus 
we are concerned with a stochastic subpopulation with a fitness s (or some ps) greater 
than the mean fitness of the population, evolving by our branching process model starting 
from 1 individual at $t = 0$ and with no further mutations. We denote the size of this
subpopulation at time t by n(t). We have already calculated P(n,t), but this quantity 
offers no straightforward ways to understand whether mutations can arise while n is still 
stochastic. 

The expected number of mutations that arise from the mutant lineage is $\int_0^\infty U_b\,n(t)\,dt$.
Inspired by this, we define

$$W = \int_0^\infty n(t)\,dt \qquad (83)$$

as the "weight" of the mutant lineage. If the lineage becomes established, $W$ will be infinite
(the nonlinear saturation effects are not part of the branching process). However, if the
lineage goes extinct due to drift, $W$ is the overall integrated population size. The expected
number of mutations destined to survive drift, $k$, that arise from this lineage is therefore
$k = W U_b s$.

We can exploit the independence between stochastic lineages (valid because $Ns \gg 1$) to
calculate $W$. The initial mutant that founds the lineage will either die (with probability
$\frac{1}{2+s}$) or give birth (with probability $\frac{1+s}{2+s}$). The time $T$ until this happens is exponentially
distributed with rate $2+s$ (i.e. $\mathrm{prob}[T = t] = (2+s)e^{-(2+s)t}$). If it dies, $W$ is simply $T$. If
it gives birth, $W$ is $T$ plus the $W$ of each of the two offspring. We therefore have

$$\mathrm{prob}[W = w] = \frac{1}{2+s}\,\mathrm{prob}[T = w] + \frac{1+s}{2+s}\int_0^w du \int_0^{w-u} dv\;\mathrm{prob}[T = u]\,\mathrm{prob}[W = v]\,\mathrm{prob}[W = w-u-v]. \qquad (84)$$

Converting to Laplace transforms, we can solve for $\tilde{W}$ to find

$$\tilde{W}(z) = \frac{2 + z + s - \sqrt{z^2 + 2zs + s^2 + 4z}}{2(1+s)}, \qquad (85)$$

where $\tilde{W}(z)$ is the Laplace transform of $\mathrm{prob}[W = w]$. Note that $\tilde{W}(z \to 0) = \frac{1}{1+s}$, not 1.
This is because there is a finite probability (roughly $s$) that the lineage becomes established
and thus has infinite weight. To focus on the lineages that do go extinct, we simply ignore
this weight at infinity.
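
The Laplace-transformed recursion can be verified numerically. With the exponential waiting time, $\tilde{T}(z) = (2+s)/(2+s+z)$, the recursion for the weight becomes the quadratic $(1+s)\tilde{W}^2 - (2+s+z)\tilde{W} + 1 = 0$, whose smaller root is the closed form $\tilde{W}(z) = [2+z+s-\sqrt{z^2+2zs+s^2+4z}]/[2(1+s)]$. The Python sketch below checks this; the function names are hypothetical and the value of $s$ is arbitrary.

```python
import math

def weight_transform(z, s):
    """Closed-form Laplace transform of the weight distribution (smaller root)."""
    disc = z * z + 2.0 * z * s + s * s + 4.0 * z
    return (2.0 + z + s - math.sqrt(disc)) / (2.0 * (1.0 + s))

def recursion_residual(z, s):
    """Residual of (1+s) W^2 - (2+s+z) W + 1 = 0; zero if W solves the recursion."""
    W = weight_transform(z, s)
    return (1.0 + s) * W * W - (2.0 + s + z) * W + 1.0

if __name__ == "__main__":
    s = 0.05
    for z in (0.0, 1e-3, 0.5, 3.0):
        print(z, weight_transform(z, s), recursion_residual(z, s))
    # Total probability of finite weight: 1/(1+s), i.e. extinction probability.
    print(weight_transform(0.0, s), 1.0 / (1.0 + s))
```

The value at $z = 0$ is $1/(1+s)$, so the missing probability $s/(1+s) \approx s$ is exactly the chance that the lineage establishes and acquires infinite weight.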

This form of $\tilde W(z)$ is impossible to invert analytically for general $w$. However, the small-$z$ 
behavior controls the dynamics at large $w$. For $w \gg 1$ we have that $\mathrm{prob}[W=w]$ falls off 
at least as fast as 

$$\mathrm{prob}[W=w] \approx \frac{1}{2\sqrt{\pi}\,(1+s)}\, e^{-s^2 w/4}\, w^{-3/2}. \qquad (86)$$

Values of $w$ that are larger than roughly $1/s^2$ are exponentially suppressed. Integrating this 
result, we find that less than a fraction $s$ of the lineages have a weight greater than $1/s^2$, and 
almost all of these are right at $w \sim 1/s^2$. This makes intuitive sense. The largest size a lineage 
can reach without establishing is roughly $1/s$. If it does so, it takes roughly $1/s$ generations to 
get to this size and another $1/s$ generations to then go extinct. This is because the dynamics 
are an approximately neutral process while the lineage size is less than $1/s$ (drift dominates 
selection in this regime), so the classical neutral result applies. During this period its average 
size is of order $1/s$, so the maximum value $w$ can take should indeed be about $1/s^2$. The 
chance of the lineage reaching size $1/s$ is also about $s$ (again by analogy to the classical neutral 
result) and once there it is about as likely to establish as to eventually go extinct. So our 
result that $w$ takes on this maximum value roughly a fraction $s$ of the time also makes 
intuitive sense. 
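
The claim that fewer than a fraction $s$ of lineages carry weight above $1/s^2$ can be checked by integrating the asymptotic tail $e^{-s^2 w/4} w^{-3/2} / [2\sqrt{\pi}(1+s)]$ in closed form. The sketch below assumes this asymptotic form holds all the way down to the cutoff, so it gives only an order-of-magnitude check; the function name is hypothetical.

```python
import math

def tail_fraction(s: float, w_min: float) -> float:
    """Integral from w_min to infinity of exp(-s^2 w/4) w^(-3/2) / (2 sqrt(pi) (1+s)).

    Uses the closed form (integration by parts, then an error-function integral):
      int_{w0}^inf e^{-a w} w^{-3/2} dw
        = 2 e^{-a w0} / sqrt(w0) - 2 sqrt(pi a) erfc(sqrt(a w0)),  with a = s^2/4.
    """
    a = s * s / 4.0
    integral = (2.0 * math.exp(-a * w_min) / math.sqrt(w_min)
                - 2.0 * math.sqrt(math.pi * a) * math.erfc(math.sqrt(a * w_min)))
    return integral / (2.0 * math.sqrt(math.pi) * (1.0 + s))

if __name__ == "__main__":
    s = 0.01
    frac = tail_fraction(s, 1.0 / s**2)
    # Ratio to s: a constant of order 0.2, i.e. comfortably below 1.
    print(frac / s)
```

Under these assumptions the tail fraction scales linearly with $s$, with a prefactor well below one, consistent with the statement in the text.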

To assume that mutations destined to establish never arise from a subpopulation destined 
to go extinct, we require $W U_b s \ll s$. Note the RHS of this expression is $s$ because there 
are $1/s$ lineages that go extinct for every one that establishes, and mutations destined to fix 
must be much more likely to arise from lineages that establish. Since the maximum value 
of $w$ is roughly $1/s^2$ and this occurs a fraction $s$ of the time, this translates to the condition 
$U_b/s \ll 1$. (Values of $w$ less than $1/s^2$ are more common, but in sum are still less likely to 
produce a mutation.) Thus we can ignore mutations from stochastic lineages destined to go 
extinct provided 

$$\frac{U_b}{s} \ll 1. \qquad (87)$$

We have not yet considered whether mutations can arise in the stochastic period of lineages 
destined to survive. We address this question in more detail in Appendix G. However, 
below a size $1/s$ the lineages that establish behave similarly to the lineages that are destined 
to reach size $1/s$ and then go extinct, and above this size the surviving lineages quickly become 
deterministic. Thus we expect that whenever mutations never arise from lineages that go 
extinct, they will also never arise during the stochastic period of lineages destined to survive. 



Appendix G: The $\tau(t)$ Approximation 

Our method of linking the deterministic behavior of the bulk of the population to the 
stochastic behavior at the nose hinges on our definition of $\tau_q$. We defined $\tau_q$ as $\tau(t\to\infty)$, 
where $\tau(t)$ is defined by 

$$n_q(t) = \frac{1}{qs}\, e^{qs[t-\tau(t)]}. \qquad (88)$$

The variable $\tau(t)$ is just a change of variable from $n_q(t)$. From its definition, we see that 
$\tau(t)$ is the time at which the subpopulation would have reached size $1/(qs)$ had it always grown 
exponentially at rate $qs$ until reaching size $n_q(t)$ at time $t$. Thus $\tau(t)$ accounts for all the 
incoming mutations and stochastic behavior up to time $t$ and allows us to summarize it 
by saying $n_q$ reached size $1/(qs)$ at time $\tau(t)$ and was deterministic thereafter. The definition 
of $\tau_q$ as $\tau(t\to\infty)$ thus summarizes all the random behavior and all incoming mutations 
into a time the subpopulation would have reached size $1/(qs)$. Yet this is not actually the time 
the subpopulation reached size $1/(qs)$ (Fig. 5). It could, for example, have reached this size earlier 
than this but by chance grown slower than $e^{qst}$ for a while thereafter. Despite this, we have 
assumed that the subpopulation did in fact reach size $1/(qs)$ at its establishment time in defining 
its size thereafter. That is, we have written $n_{q-1}(t) = \frac{1}{qs} e^{(q-1)st}$, defining $t = 0$ to be the 
establishment time of this population. And we use this form of $n_{q-1}(t)$ in calculating how 
many mutations this subpopulation generates. 
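
To make the change of variable concrete, the definition $n_q(t) = \frac{1}{qs} e^{qs[t-\tau(t)]}$ can be inverted to give $\tau(t) = t - \ln[qs\,n_q(t)]/(qs)$. The following minimal sketch (with a hypothetical function name and arbitrary illustrative parameter values) shows that a population which really did grow deterministically from size $1/(qs)$ returns the same inferred time at every $t$, whereas any slowdown along the way pushes $\tau(t)$ later.

```python
import math

def inferred_tau(n_q: float, t: float, q: float, s: float) -> float:
    """Time at which the subpopulation would have had size 1/(q s),
    extrapolating back from size n_q at time t at growth rate q s."""
    return t - math.log(q * s * n_q) / (q * s)

if __name__ == "__main__":
    q, s, tau0 = 3.0, 0.01, 40.0   # illustrative values, not taken from the paper
    for t in (100.0, 200.0, 400.0):
        n = math.exp(q * s * (t - tau0)) / (q * s)   # purely deterministic growth
        print(t, inferred_tau(n, t, q, s))           # recovers tau0 each time
    # A population half as large at the same time has a later inferred tau.
    n_slow = 0.5 * math.exp(q * s * (200.0 - tau0)) / (q * s)
    print(inferred_tau(n_slow, 200.0, q, s))
```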

In order for this to be reasonable, our form of $n_{q-1}(t)$ must be accurate once this population 
becomes large enough that it starts generating mutants. This happens roughly $\tau_q$ 
generations after $n_{q-1}$ became established (by definition, it takes roughly $\tau_q$ generations for 
the next mutations to occur, because $\tau_q$ is dominated by the waiting time for the first mutation 
to occur). Thus for our result to be accurate, $\tau(2\tau_q)$ must be approximately $\tau_q$ (to 
be precise, we require $\tau(2\tau_q) - \tau_q \ll 1/(qs)$). That is, there must not be much stochasticity after 
the population is large enough to generate mutations (and additional incoming mutations 
must be negligible). Looked at another way, this means that the population cannot generate 
mutations while it is stochastic. 

To calculate $\tau(2\tau_q)$, we return to our solution $H(\zeta, t)$ for the Laplace transform of $P(n_q, t)$. 
The time-dependence of $\tau$ is hidden in Eq. (28): our assumption that $\zeta$ is small here 
assumes we are interested only in larger $n_q$ and is thus equivalent to taking $t \to \infty$. We can 
do this integral more carefully; the result involves hypergeometric functions. These can be 
expanded for $\zeta \ll qs$ but nonzero, corresponding to values of $n_q(t)$ larger than $1/(qs)$ but before 
this subpopulation generates mutations. We find 



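$$H(\zeta,t) = \exp\left[\;\cdots\;\right] \qquad (89)$$

[The argument of the exponential is not recoverable from the source text; the surviving fragments are a power $\zeta^{1/q}$ and a denominator containing $\sin(\pi/q)\,(qs)^{1-1/q}\,s$.]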

Unfortunately, this form of $H$ is more complex and we cannot exactly compute $\tau(t)$. However, 
we can find typical values of $\tau(t)$ and $\tau_q$ from this by the same methods as before. 
We can also compare the size of the second term in $H$ (which gives the time dependence 
in $\tau(t)$) to the first for the values of $\zeta$ that correspond to $n_q$ at the time this 
subpopulation begins to generate new mutations. Both calculations demonstrate that our 
approximation is valid provided that 

$$\frac{U_b}{qs} \ll 1. \qquad (90)$$

This result can be confirmed with a deterministic analysis. About $\tau_q$ generations after 
becoming established, a subpopulation has reached the size at which it begins to generate 
mutations. Once it has reached this size, 
selection dominates drift and mutations. Thus subsequent random or mutational events will 
not significantly affect $n$, so $\tau(2\tau_q)$ and $\tau_q = \tau(\infty)$ are similar. 

Thus whenever $U_b/(qs) \ll 1$, our method of linking together stochastic and deterministic 
dynamics is reasonable. Populations never generate mutations while they are stochastic, 
and hence we are justified in using a deterministic approximation for all but the most-fit 
population. When this condition fails, we must treat multiple populations stochastically 
and the analysis becomes much more complex. We could still divide up the population into 
a nonlinear deterministic part and a linear stochastic part (provided only that $Ns \gg 1$), 
but the stochastic part would have to include multiple subpopulations. 



Appendix H: Approximations in the Behavior of q 

In our analysis to this point, we have assumed that the mean fitness $\bar y s$ changes abruptly, 
increasing by $s$ every $\tau_q$ generations. We used this assumption in calculating $q$ and it is the 
reason why we have a constant $q$. In this Appendix, we discuss this approximation. 

There are two important time scales that determine the relative sharpness of the 
changeovers from one dominant population to the next and, concomitantly, from the lead 
population growing with rate $qs$ to rate $(q-1)s$. Because the second largest population grows 
at rate $s$, the time scale for this changeover is $1/s$. But the time between such changeovers 
is $\tau_q$. The ratio of these is 

$$w = \frac{1}{s\tau_q} = \frac{v}{s^2}, \qquad (91)$$

which is $1/\ln(s/U_b)$ and thus small at the crossover from the successional- to the multiple-mutations 
regimes. Indeed as long as $w \ll 1$, the changeover is relatively sharp on the scale 
of $\tau_q$ and it is a good approximation to consider it abrupt, as we have done. 
We can make this more precise by computing the actual behavior of the mean fitness as 
a function of time. Assume (for convenience) that at $t = 0$, the mean fitness is at $\bar y = -\frac{1}{2}$, 
i.e. in the middle of a changeover. The subpopulations at $y = -1$ and at $y = 0$ are equal in 
size, and those at other values of $y$ are smaller by a factor of $e^{-\sum_{k=1}^{y} k s \tau_q} = e^{-\frac{1}{2w}\left[(y+1/2)^2 - 1/4\right]}$. 
For small $w$, the one or two largest subpopulations strongly dominate, as these factors are 
all very small. This is because the variance of the fitness in the population in the multiple 
mutations regime is simply $v$, since the dynamics of the bulk of the population is controlled 
by selection. Thus the standard deviation is smaller than $s$, making the other subpopulations 
far smaller than the dominant one. The parameter $w$ is simply the variance in units of $s^2$. 
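
The two ways of writing the suppression factor can be compared directly. The sketch below (function names and parameter values are illustrative assumptions) evaluates the sum $\sum_{k=1}^{y} k s \tau_q$ for $y \ge 0$ and the closed form $\frac{1}{2w}[(y+1/2)^2 - 1/4]$ with $w = 1/(s\tau_q)$, and shows how quickly the factors fall off when $w$ is small.

```python
import math

def suppression_from_sum(y: int, s: float, tau_q: float) -> float:
    """Relative size exp(-sum_{k=1}^{y} k s tau_q), for y >= 0."""
    return math.exp(-sum(k * s * tau_q for k in range(1, y + 1)))

def suppression_closed_form(y: int, w: float) -> float:
    """Relative size exp(-[(y + 1/2)^2 - 1/4] / (2 w)); valid for any integer y."""
    return math.exp(-((y + 0.5) ** 2 - 0.25) / (2.0 * w))

if __name__ == "__main__":
    s, tau_q = 0.01, 500.0          # illustrative values only
    w = 1.0 / (s * tau_q)           # w = 0.2 for these values
    for y in range(0, 4):
        print(y, suppression_from_sum(y, s, tau_q), suppression_closed_form(y, w))
    # y = -1 and y = 0 are the two equal dominant subpopulations.
    print(suppression_closed_form(-1, w), suppression_closed_form(0, w))
```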

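At future times, the subpopulations all grow (or shrink) exponentially at a rate $ys$ reduced 
by the mean fitness (but we can neglect the mean fitness in this calculation because it affects 
all subpopulations equally). To keep the total population fixed thus requires that the mean 
fitness be 

$$\bar y(t) = \frac{\sum_{y=-\infty}^{\infty} y\, e^{yst - (y+1/2)^2/(2w)}}{\sum_{y=-\infty}^{\infty} e^{yst - (y+1/2)^2/(2w)}}. \qquad (92)$$

We can perform these Gaussian sums by the Poisson resummation formula to yield 

$$\bar y(t) = wst - \frac{1}{2} + \frac{d}{d(st)}\left\{ \ln\left[ \sum_{k=-\infty}^{+\infty} (-1)^k e^{-2\pi^2 w k^2} \cos(2\pi k w s t) \right] \right\}. \qquad (93)$$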
If $w$ were large, the $k = 0$ term would dominate, and the $k = \pm 1$ terms yield relative variations in 
the speed 

$$\frac{v(t)}{v} \approx 1 + 8\pi^2 w\, e^{-2\pi^2 w} \cos(2\pi w s t) + O\!\left(e^{-4\pi^2 w}\right) \qquad (94)$$

and corresponding variations in $\bar y(t) - t/\tau_q$ that are smaller by a factor of $1/(2\pi)$. Here the sign of the oscillating term follows from taking $t = 0$ in the middle of a changeover, where the mean fitness is advancing fastest. Thus in 
practice the parameter that needs to be large for $\bar y(t)$ to increase smoothly is $2\pi^2 w$. Only for 
$w < 0.2$ do the variations in $v$ become more than a factor of two, and substantial deviations 
of $\bar y(t)$ from smooth occur only for $w < 0.1$. Above this, our abrupt-transition approximation 
is not valid, but despite this our earlier results are still good; we discuss this below. 
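
The size of these oscillations can be checked by brute force. The sketch below sums directly over subpopulations with weights $e^{yx - (y+1/2)^2/(2w)}$, where $x = st$, and uses the identity $d\bar y/dx = \mathrm{Var}(y)$ (valid for any distribution of this exponential-family form) so that the instantaneous speed relative to its average is $\mathrm{Var}(y)/w$. It then compares the amplitude of the deviation from 1 with the leading-order value $8\pi^2 w e^{-2\pi^2 w}$. The truncation `y_max` and the function names are assumptions of this sketch.

```python
import math

def fitness_moments(x: float, w: float, y_max: int = 200):
    """Mean and variance of y under weights exp(y x - (y + 1/2)^2 / (2 w))."""
    ys = range(-y_max, y_max + 1)
    logs = [y * x - (y + 0.5) ** 2 / (2.0 * w) for y in ys]
    m = max(logs)                              # subtract the max for numerical stability
    wts = [math.exp(l - m) for l in logs]
    total = sum(wts)
    mean = sum(y * p for y, p in zip(ys, wts)) / total
    var = sum((y - mean) ** 2 * p for y, p in zip(ys, wts)) / total
    return mean, var

def relative_speed(x: float, w: float) -> float:
    """Instantaneous speed of the mean fitness divided by its long-time average."""
    return fitness_moments(x, w)[1] / w

def leading_amplitude(w: float) -> float:
    """Leading-order amplitude of the relative speed oscillation."""
    return 8.0 * math.pi ** 2 * w * math.exp(-2.0 * math.pi ** 2 * w)

if __name__ == "__main__":
    for w in (0.5, 0.2, 0.1):
        dev = relative_speed(0.0, w) - 1.0
        print(w, dev, leading_amplitude(w))
```

For $w = 0.5$ the two agree closely; for $w$ of 0.2 or below the leading-order expression is only a rough guide, which is the regime in which the text says the changeovers become sharp.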

The parameter that we have taken to be small throughout is $1/\ln(s/U_b)$. This is the value 
of $w$ at the crossover from successional- to multiple-mutations regimes. Strictly speaking, 
this means that $w$ is small until $\ln Ns \sim [\ln(s/U_b)]^2$. For even larger population sizes, the 
behavior near the nose changes somewhat, as discussed in the main text. 

When $w$ is small enough that the shifts in $\bar y$ are abrupt, the dynamics can be worked 
out more generally than we have done in the main text. There, we approximated the most-fit 
deterministic subpopulation to be growing as $e^{(q-1)st}$ for $\tau_q$ generations, after which $\bar y$ 
increases by 1 and the subpopulation growth slows to $e^{(q-2)st}$, and so on. This is strictly 
valid only for integer $q$. When the naive value for $q$, $2L/\ell$, is non-integer, the populations 
shift between growth rates some fraction of the way between one establishment and the next. 
The effects of this can be taken into account straightforwardly as long as the shift between 
growth rates is indeed abrupt on the scale of $\tau_q$, i.e. as long as $w$ is small. Here we ignore factors 
inside logarithms: to get these one would need to use the fuller analysis of the feeding and 
lead population dynamics used in the text. For our purposes here, the heuristic derivation 
of the establishment times is sufficient. 

It is convenient to keep $q$ an integer, with $qs$ the growth rate of the lead population when 
it first becomes established. We then define a non-integer generalization of $q$ to be $\tilde q$ with 

$$q = \mathrm{GI}(\tilde q), \qquad (95)$$

the greatest integer less than or equal to $\tilde q$. It is $\tilde q$ that is simply related to the population 
parameters via 

$$\tilde q \approx 2L/\ell, \qquad (96)$$

i.e. what was previously found for $q$. The dimensionless speed is found to be 

$$w \approx \frac{q(q-1)}{(2q - \tilde q)\,\ell}, \qquad (97)$$

which is equal to the result in the text, $w \approx (\tilde q - 1)/\ell$, for integer $\tilde q$. The difference 
between these is small for large $q$, with the fractional error of the simple result (which is an 
overestimate) largest at $\tilde q = q + \frac{1}{2}$, where it is only $1/[4q(q-1)]$ and thus small even for the 
worst case $\tilde q = 2.5$, $q = 2$. 
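
The quoted worst-case error can be reproduced numerically. The sketch below assumes the following reading of the abrupt-shift heuristic: a newly established lead population grows at rate $qs$ for part of the interval between establishments and at $(q-1)s$ for the rest, and must grow by a factor with logarithm $\ell$ before the next establishment and by a total logarithm $L$ before it peaks. Under that assumption the dimensionless speed is $q(q-1)/[(2q - \tilde q)\ell]$, where `q_tilde` stands for the non-integer value $2L/\ell$ and $q$ is its integer part. This closed form is our reconstruction from the heuristic, not a quotation.

```python
import math

def abrupt_shift_speed(q_tilde: float, ell: float) -> float:
    """Dimensionless speed w under the abrupt-shift heuristic (assumed form)."""
    q = math.floor(q_tilde)
    return q * (q - 1) / ((2 * q - q_tilde) * ell)

def simple_speed(q_tilde: float, ell: float) -> float:
    """Main-text result w = (q_tilde - 1) / ell, applied to non-integer q_tilde."""
    return (q_tilde - 1.0) / ell

def fractional_overestimate(q_tilde: float, ell: float = 10.0) -> float:
    """Fractional amount by which the simple result exceeds the abrupt-shift one."""
    return simple_speed(q_tilde, ell) / abrupt_shift_speed(q_tilde, ell) - 1.0

if __name__ == "__main__":
    for q_tilde in (2.0, 2.25, 2.5, 2.75, 3.0, 10.5):
        q = math.floor(q_tilde)
        print(q_tilde, fractional_overestimate(q_tilde), 1.0 / (4 * q * (q - 1)))
```

The ratio does not depend on $\ell$, so the value `ell = 10.0` is arbitrary; the error vanishes at integer values and peaks midway between them.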

In the opposite case where $w$ is not small, the approximation of abrupt shifts in $\bar y$ is not 
valid. In this case, we can make the opposite approximation that the mean fitness increases 
at a uniform rate: from the above discussion, this is valid unless $w$ is quite small (although 
strictly speaking this is not true in the limit that $\ell$ is large with fixed $q$). In the constant 
mean-speed approximation, one obtains 

$$w \approx \frac{\tilde q - 1 + \sqrt{\tilde q(\tilde q - 2)}}{2\ell}, \qquad (98)$$

which is an underestimate that is worst at integer $\tilde q$: for large $q$ the worst fractional error is 
$1/[4(q-1)^2]$. Since this is small compared to the speed, the approximation in the main text 
is reasonable. 
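
A similar check applies to the constant mean-speed limit. The sketch assumes that the lead population's growth rate declines linearly in time as the mean fitness advances uniformly, and that it must again grow by a logarithm $\ell$ before the next establishment and by a total logarithm $L$ before it peaks. Eliminating the initial lead gives $w = [\tilde q - 1 + \sqrt{\tilde q(\tilde q - 2)}]/(2\ell)$ with $\tilde q = 2L/\ell$. This expression is derived here from those assumptions rather than quoted, and it requires $\tilde q \ge 2$.

```python
import math

def constant_speed(q_tilde: float, ell: float) -> float:
    """Dimensionless speed w when the mean fitness advances uniformly (assumed form)."""
    return (q_tilde - 1.0 + math.sqrt(q_tilde * (q_tilde - 2.0))) / (2.0 * ell)

def simple_speed(q_tilde: float, ell: float) -> float:
    """Abrupt-shift result w = (q_tilde - 1) / ell, exact at integer q_tilde."""
    return (q_tilde - 1.0) / ell

def fractional_underestimate(q: int, ell: float = 10.0) -> float:
    """Fractional shortfall of the constant-speed result at integer q."""
    return 1.0 - constant_speed(float(q), ell) / simple_speed(float(q), ell)

if __name__ == "__main__":
    for q in (3, 5, 10, 20):
        print(q, fractional_underestimate(q), 1.0 / (4 * (q - 1) ** 2))
```

The shortfall approaches $1/[4(q-1)^2]$ as $q$ grows, matching the large-$q$ estimate in the text; for small $q$ the two differ by higher-order terms.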

We can get an intuitive understanding of why this approximation of abrupt shifts in 
$\bar y$ gives reasonable results, even when $\bar y$ actually increases smoothly. First we consider 
the deterministic dynamics of the bulk of the fitness distribution. Here the shape of the 
distribution (and hence the identity of the most common subpopulation) depends only on 
the relative growth rates of the subpopulations, so assumptions about $\bar y$ are irrelevant. For 
the stochastic behavior at the nose, our assumption is more problematic. When the mean 
fitness in fact increases steadily, rather than jumping by $s$ every time an establishment 
occurs, our calculated lead $q$ gives the correct average mean fitness over the stochastic period. 
This means we calculate the stochastic dynamics assuming the correct average mean fitness, 
but this is slightly different from the stochastic dynamics given the changing mean fitness. 
Essentially, we have used $q = 3.4$, for example, as an interpolation for the correct behavior 
when the lead is just below 4 immediately before an establishment, declining gradually to 
below 3 shortly before the next establishment. Rather than calculate $\tau_{3.4}$ from the stochastic 
behavior while the lead shifts correctly, in the main text we have calculated it based on a 
constant lead of 3.4. As we have seen above, however, the difference is small. 

We conclude with some comments on the stochastic aspects of the speed of the nose. 
These make the above analysis questionable because of the assumption of deterministic 
establishments of the lead populations. But, as we discuss in Appendix D, the variations 
in the establishment times are at worst of order $1/(qs)$ compared to the mean $\tau_q$, which is of 
order $\ln(s/U_b)/(qs)$ (for large $q$ they are even smaller than this, as discussed in Appendix D). Thus 
the variations in the time intervals between takeovers of the population by new dominant 
subpopulations are small compared to the time intervals themselves. Hence the deterministic 
approximation for the increase of the mean fitness is good, at least as far as its effects on 
the dynamics of the lead populations are concerned. 
Figure 1: 

For beneficial mutations to be acquired by a population, they must both arise and fix. (a) 
A small asexual population in the successional-mutations (or strong selection weak mutation) 
regime. Mutation A arises early on. Provided it survives drift, it fixes quickly, before 
another beneficial mutation occurs. Some time later, a second mutation B occurs and fixes. 
Evolution continues by this sequential fixation process. (b) A larger population in the 
concurrent-mutations (strong selection strong mutation) regime. A mutation A occurs, but 
before it can fix another mutation B occurs and the two interfere. Here a second mutation, 
C, occurs in an individual that already has the first mutation A and these two begin fixing 
together, driving the single-mutants to extinction. This dynamic continues with further 
mutations, such as E and F, occurring in the already-double-mutant population. The key 
process is how quickly mutations arise in individuals that already have other mutations. 
This picture has elements of both clonal interference and multiple mutations, illustrated 
separately in (c) and (d). (c) The clonal interference effect in large populations: a weak-effect 
beneficial mutation A occurs and begins to sweep, but is outcompeted by a later but 
more-fit mutation B, which in turn is outcompeted by mutation C. C fixes before any larger 
mutations can occur; the process can then begin again. Multiple mutations are ignored here. 
(d) The multiple mutation effect: several mutations, A, B, and C, of identical effect occur 
and begin to spread. Mutant lineage B happens to get a second beneficial mutation D, 
which helps it sweep, outcompeting A and C. Eventually this lineage gets a third beneficial 
mutation E. Mutations that occur in less-fit lineages, or those that do not happen to get 
additional mutations soon enough (such as BDF), are driven extinct. 
Figure 2: 

The simple positive selection model we study. A large number of beneficial mutations 
are possible, each of which increases the fitness by the same amount s. Thus the population 
climbs a long fitness "hill." All mutations are taken to have the same effect on fitness, and 
the supply is assumed to be large enough that they are not depleted. Epistatic interactions 
and deleterious mutations are both ignored. 
Figure 3: 

Schematic of the evolution of large asexual populations. (a) The population is initially 
clonal. Beneficial mutations of effect $s$ create a subpopulation at fitness $s$, which drifts 
randomly until after time $\tau_1$ it reaches a size of order $1/s$, after which it behaves 
deterministically. (b) This subpopulation generates mutations at fitness $2s$. Meanwhile, the mean 
fitness of the population increases, so the initial clone begins to decline. (c) A steady state 
is established. In the time it takes for new mutations to arise, the less fit clones die out 
and the population moves rightward while maintaining an approximately constant lead from 
peak to nose, $qs$ (here $q = 5$). The inset shows the leading nose of the population. Note the 
logarithmic scale of the populations. 
Figure 4: 

Schematic of a typical fitness distribution on a logarithmic scale. The total population size 
is large: $Ns \gg 1$. At the front of the distribution (the nose), where only a few individuals 
are present, stochastic effects are strong but nonlinear saturation is not. The reverse is true 
in the bulk of the distribution. Stochasticity is strong only when a subpopulation size $n$ 
is small, $n \lesssim 1/s$, and saturation is strong only when a subpopulation size is large, $n \sim N$. 
Thus there is a wide intermediate regime where neither matters. We can therefore use a 
nonlinear deterministic model in the bulk of the distribution, a linear stochastic model at 
the front, and match the two in the intermediate regime where both are valid. The bulk of 
the distribution is dominated by selection, which gives rise to a steady-state Gaussian shape 
except near the nose. 






Figure 5: 

(a) The definition of the establishment time $\tau_{est}$. A single mutant individual is assumed 
to exist at $t = 0$. It drifts stochastically until it either goes extinct, or eventually gets large 
enough that it grows exponentially and its behavior becomes roughly deterministic. We 
define $\tau_{est}$ to be the inferred time at which the population would have reached size $1/s$ if one 
extrapolated backwards from the long-time deterministic behavior. Note that $\tau_{est}$ is not the 
time the population actually reached size $1/s$ (indeed, $\tau_{est}$ can be negative). (b) The definition 
of $\tau_q$: the time between successive establishments of the lead population with fitness $qs$ more 
than the mean. Mutations occur with a rate that grows exponentially with time. Here, $\tau_q$ 
is the time the new lead population would have reached size $1/(qs)$, extrapolating backwards 
from its long-time deterministic behavior. This includes both the time to generate a mutant 
destined to establish and the time for it to drift to substantial frequency (of order $1/(qs)$). 
Figure 6: 

Comparisons between simulations and our theoretical predictions for the mean speed of 
adaptation $v$ (measured in increase in fitness per generation, $\times 10^5$). (a) Speed of adaptation 
$v$ versus $\log_{10}[N]$ for $U_b = 10^{-5}$ and $s = 0.01$. Both the large $N$ (Eq. (43)) and the moderate 
$N$ (Eq. (49)) theoretical results are shown in their regimes of validity, which are above and 
below $N \approx 1/U_b$ respectively (the crossover between the two regimes is indicated in the 
figure). (b) $v$ versus $\log_{10}[U_b]$ for $N = 10^6$ and $s = 0.01$. (c) $v$ versus $\log_{10}[s]$ for $N = 10^6$ 
and $U_b = 10^{-5}$. 
Figure 7: 

Comparisons between simulations and our theoretical predictions for the mean $q$. (a) $q$ 
versus $\log_{10}[N]$ for $U_b = 10^{-5}$ and $s = 0.01$. (b) $q$ versus $\log_{10}[U_b]$ for $N = 10^6$ and $s = 0.01$. 
(c) $q$ versus $\log_{10}[s]$ for $N = 10^6$ and $U_b = 10^{-5}$. 